Probabilistic Arithmetic Automata and Their Applications
Nov.-Dec. 2012 (vol. 9 no. 6)
pp. 1737-1750
T. Marschall, Life Sci. Group, Centrum Wiskunde & Inf. (CWI), Amsterdam, Netherlands
I. Herms, Fac. of Technol., Bielefeld Univ., Bielefeld, Germany
H. Kaltenbach, Dept. of Biosyst. Sci. & Eng., Swiss Fed. Inst. of Technol. (ETH), Basel, Switzerland
S. Rahmann, Inst. of Human Genetics, Univ. of Duisburg-Essen, Essen, Germany
We present a comprehensive review of probabilistic arithmetic automata (PAAs), a general model to describe chains of operations whose operands depend on chance, along with two algorithms to numerically compute the distribution of the results of such probabilistic calculations. PAAs provide a unifying framework to approach many problems arising in computational biology and elsewhere. We present five different applications, namely 1) pattern matching statistics on random texts, including the computation of the distribution of occurrence counts, waiting times, and clump sizes under hidden Markov background models; 2) exact analysis of window-based pattern matching algorithms; 3) sensitivity of filtration seeds used to detect candidate sequence alignments; 4) length and mass statistics of peptide fragments resulting from enzymatic cleavage reactions; and 5) read length statistics of 454 and IonTorrent sequencing reads. The diversity of these applications indicates the flexibility and unifying character of the presented framework. While the construction of a PAA depends on the particular application, we single out a frequently applicable construction method: we introduce deterministic arithmetic automata (DAAs) to model deterministic calculations on sequences, and demonstrate how to construct a PAA from a given DAA and a finite-memory random text model. This procedure is used for all five discussed applications and greatly simplifies the construction of PAAs. Implementations are available as part of the MoSDi package. Its application programming interface facilitates the rapid development of new applications based on the PAA framework.
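To make the DAA-to-PAA construction concrete, the following is a minimal Python sketch for the first application, the distribution of occurrence counts of a pattern in a random text. It is not the MoSDi implementation and does not use its API: the function names `build_daa` and `count_distribution` are hypothetical, the pattern is assumed to be non-empty, and the text model is assumed to be i.i.d. (the simplest finite-memory model). The DAA is a KMP-style automaton whose states are matched prefix lengths and which emits 1 whenever the full pattern has just been read; combining it with the character probabilities gives a joint distribution over (state, accumulated count) that is updated once per text position.

```python
from collections import defaultdict


def build_daa(pattern, alphabet):
    """Build a KMP-style deterministic automaton for `pattern`.

    State q means "the longest suffix of the text read so far that is a
    prefix of the pattern has length q". Returns delta[q][c] -> next state.
    """
    m = len(pattern)
    delta = []
    for q in range(m + 1):
        row = {}
        for c in alphabet:
            s = pattern[:q] + c
            k = min(m, len(s))
            # Shrink k until the last k characters of s are a pattern prefix.
            while k > 0 and s[-k:] != pattern[:k]:
                k -= 1
            row[c] = k
        delta.append(row)
    return delta


def count_distribution(pattern, char_probs, n):
    """Distribution of the number of (overlapping) occurrences of a
    non-empty `pattern` in an i.i.d. random text of length n.

    char_probs maps each character to its probability (assumed to sum to 1).
    """
    m = len(pattern)
    delta = build_daa(pattern, sorted(char_probs))
    # Joint distribution over (automaton state, accumulated count).
    dist = {(0, 0): 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for (q, v), p in dist.items():
            for c, pc in char_probs.items():
                q2 = delta[q][c]
                emission = 1 if q2 == m else 0  # a match ends here
                nxt[(q2, v + emission)] += p * pc
        dist = nxt
    # Marginalize out the automaton state.
    result = defaultdict(float)
    for (_, v), p in dist.items():
        result[v] += p
    return dict(result)


if __name__ == "__main__":
    probs = {"A": 0.5, "B": 0.5}
    for k, p in sorted(count_distribution("ABA", probs, 6).items()):
        print(k, p)
```

The state space here grows with the largest reachable count, so the sketch is only practical for moderate text lengths. A text model with longer memory would extend the joint state by the model's context, which this sketch does not attempt.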
Index Terms:
probabilistic automata, arithmetic, biology computing, enzymes, hidden Markov models, pattern matching, application programming interface, probabilistic arithmetic automata, PAA, computational biology, pattern matching statistics, hidden Markov background models, filtration seeds, enzymatic cleavage reactions, IonTorrent sequencing reads, deterministic arithmetic automata, finite-memory random text model, Automata, Computational modeling, Markov processes, Probabilistic logic, Bioinformatics, dynamic programming, Probabilistic automaton, text model, hidden Markov model, statistics, clump, string algorithm, analysis of algorithms, alignment seed, peptide mass fingerprinting, DNA sequencing
Citation:
T. Marschall, I. Herms, H. Kaltenbach, S. Rahmann, "Probabilistic Arithmetic Automata and Their Applications," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 6, pp. 1737-1750, Nov.-Dec. 2012, doi:10.1109/TCBB.2012.109