18th International Conference on Pattern Recognition (ICPR'06) Volume 3
A Hybrid, Recursive Algorithm for Clustering Expressed Sequence Tags in Chlamydomonas reinhardtii
Hong Kong
August 20-24, 2006
ISBN: 0-7695-2521-0
Monica Jain, Carnegie Inst. of Washington, 260 Panama Street, Stanford, CA
Hilary Holz, CSU, East Bay, 25800 Carlos Bee Blvd, Hayward, CA, USA
Jeff Shrager, Carnegie Inst. of Washington, 260 Panama Street, Stanford, CA
Olivier Vallon, CNRS/Université Paris 6, Institut de Biologie Physico-Chimique, Paris, France
Arthur Grossman, Carnegie Inst. of Washington, 260 Panama Street, Stanford, CA
We present an efficient, fully automated algorithm that assembles ESTs into full-length cDNA sequences, each representing the complete coding region of a gene. Our EST clustering algorithm is neither hierarchical nor incremental but recursive, and it processes each EST once. The algorithm exploits a variety of syntactic and statistical features of the ESTs. The resulting assembly shows significant improvement in computational efficiency and information extraction over a previous assembly of C. reinhardtii ESTs. The algorithm was developed using iterative and participatory design on C. reinhardtii; however, it can be used for any organism with a draft genomic sequence.
Citation:
Monica Jain, Hilary Holz, Jeff Shrager, Olivier Vallon, Charles Hauser, Arthur Grossman, "A Hybrid, Recursive Algorithm for Clustering Expressed Sequence Tags in Chlamydomonas reinhardtii," 18th International Conference on Pattern Recognition (ICPR'06), vol. 3, pp. 404-407, 2006