2004 IEEE Computational Systems Bioinformatics Conference (CSB'04)
Protein Classification into Domains of Life Using Markov Chain Models
Stanford, California
August 16-August 19
ISBN: 0-7695-2194-0
Francisca Zanoguera, Serono Pharmaceutical Research Institute
Massimo de Francesco, Serono Pharmaceutical Research Institute
It has recently been shown that oligopeptide composition allows the proteomes of different organisms to be clustered into the main domains of life. In this paper, we go a step further by showing that, given a single protein, it is possible to predict whether it has a bacterial or eukaryotic origin with 85% accuracy. We obtain this result after ensuring that no important homologies exist between the sequences in the test set and those in the training set. To do this, we model the sequence as a Markov chain. A bacterial and a eukaryotic model are produced from the respective training sets. Each input sequence is then classified by calculating the log-odds ratio of its probability under the two models. By analyzing the resulting models, we extract a set of the most discriminant oligopeptides, many of which are part of known functional motifs.
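The classification step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the Markov chain order, the smoothing scheme, or the decision threshold, so the first-order default, the add-one pseudocount, the zero threshold, and all function names below are assumptions made for illustration only.

```python
import math
from collections import defaultdict

# The 20 standard amino acids; other symbols (e.g. X) are skipped in training.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_ALPHABET = set(AMINO_ACIDS)


def train_markov_model(sequences, order=1, pseudocount=1.0):
    """Estimate log P(residue | previous `order` residues) from training sequences.

    `order` and `pseudocount` are illustrative assumptions, not values from the paper.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            context, residue = seq[i - order:i], seq[i]
            # Ignore transitions that involve non-standard residues.
            if residue in _ALPHABET and set(context) <= _ALPHABET:
                counts[context][residue] += 1.0

    model = {}
    for context, residue_counts in counts.items():
        total = sum(residue_counts.values()) + pseudocount * len(AMINO_ACIDS)
        model[context] = {
            aa: math.log((residue_counts.get(aa, 0.0) + pseudocount) / total)
            for aa in AMINO_ACIDS
        }
    return model


def log_likelihood(seq, model, order=1):
    """Sum the log transition probabilities of `seq` under `model`."""
    # Unseen contexts and non-standard residues fall back to a uniform probability.
    uniform = math.log(1.0 / len(AMINO_ACIDS))
    total = 0.0
    for i in range(order, len(seq)):
        context, residue = seq[i - order:i], seq[i]
        total += model.get(context, {}).get(residue, uniform)
    return total


def log_odds(seq, bacterial_model, eukaryotic_model, order=1):
    """Log-odds of the sequence under the bacterial versus the eukaryotic model."""
    return (log_likelihood(seq, bacterial_model, order)
            - log_likelihood(seq, eukaryotic_model, order))


def classify(seq, bacterial_model, eukaryotic_model, order=1):
    """Label a sequence by the sign of its log-odds score (ties go to eukaryotic)."""
    score = log_odds(seq, bacterial_model, eukaryotic_model, order)
    return "bacterial" if score > 0 else "eukaryotic"
```

In use, one model would be trained per domain, for example `bacterial_model = train_markov_model(bacterial_training_sequences)`, and each test protein passed to `classify`. Transitions whose log-probabilities differ most between the two models are natural candidates for the discriminant oligopeptides mentioned above, although the abstract does not say how the authors ranked them.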
Citation:
Francisca Zanoguera, Massimo de Francesco, "Protein Classification into Domains of Life Using Markov Chain Models," csb, pp.517-519, 2004 IEEE Computational Systems Bioinformatics Conference (CSB'04), 2004