2007 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2007)
Computational Identification of Protein-Coding Sequences by Comparative Analysis
Fremont, California
November 02-November 04
ISBN: 0-7695-3031-1
Gene prediction is an essential step in understanding the genome of a species once it has been sequenced. A promising direction in current research on gene finding is the comparative genomics approach. In this paper, we present a novel approach to identifying evolutionarily conserved protein-coding sequences in genomes. The method takes advantage of the specific substitution pattern of coding sequences together with the consistency of reading frames. It has been implemented in a software tool called Protea. Large-scale experimentation shows good results. Protea is intended to be a useful complement to existing tools based on homology search or on statistical properties of the sequences.
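As a rough illustration of the kind of signal the abstract refers to, the sketch below counts mismatches per codon position in a pairwise alignment for each of the three forward reading frames. It rests on the general observation that substitutions in conserved coding regions tend to concentrate at the third codon position. This is not Protea's algorithm, which the abstract does not detail; the function names and the toy sequences are hypothetical.

```python
def mismatch_by_codon_position(seq_a, seq_b, frame):
    """Count mismatches at codon positions 1, 2 and 3 for a frame in {0, 1, 2}.

    Assumption: seq_a and seq_b are already aligned (same length, '-' for gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = [0, 0, 0]
    for i in range(frame, len(seq_a)):
        a, b = seq_a[i], seq_b[i]
        if a == "-" or b == "-":
            continue  # gapped columns are ignored in this sketch
        if a != b:
            counts[(i - frame) % 3] += 1
    return counts


def best_frame(seq_a, seq_b):
    """Return (frame, fraction) maximizing the share of third-position mismatches."""
    best = (None, 0.0)
    for frame in range(3):
        counts = mismatch_by_codon_position(seq_a, seq_b, frame)
        total = sum(counts)
        if total == 0:
            continue  # identical sequences carry no substitution signal
        fraction = counts[2] / total
        if fraction > best[1]:
            best = (frame, fraction)
    return best


# Toy alignment (hypothetical data): all differences fall on every third base.
seq_a = "ATGGCTAAAGGTCTG"
seq_b = "ATGGCCAAGGGACTA"
print(best_frame(seq_a, seq_b))
```

A frame in which mismatches cluster at the third position is consistent with a coding region read in that frame; a real method would also need to handle both strands, gaps, and statistical significance.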
Citation:
Arnaud Fontaine, Hélène Touzet, "Computational Identification of Protein-Coding Sequences by Comparative Analysis," 2007 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2007), pp. 95-102, 2007