Machine Learning Techniques for the Automated Classification of Adhesin-Like Proteins in the Human Protozoan Parasite Trypanosoma cruzi
Issue No. 04 - October-December (2009 vol. 6)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2008.125
José L. Ramírez, Instituto de Estudios Avanzados, Caracas
Ana M. González, Universidad Autónoma de Madrid, Madrid
José R. Dorronsoro, Universidad Autónoma de Madrid, Madrid
Francisco J. Azuaje, Research Center for Public Health (CRP-Santé), Luxembourg
José F. da Silveira, Escola Paulista de Medicina, UNIFESP, Brazil
This paper reports on the evaluation of different machine learning techniques for the automated classification of coding gene sequences obtained from several organisms in terms of their functional role as adhesins. Diverse, biologically meaningful, sequence-based features were extracted from the sequences and used as inputs to the in silico prediction models. Another contribution of this work is the generation of potentially novel and testable predictions about the surface protein DGF-1 family in Trypanosoma cruzi. Finally, these techniques are potentially useful for the automated annotation of known adhesin-like proteins from the trans-sialidase surface protein family in T. cruzi, the etiological agent of Chagas disease.
Chagas disease, adhesin-like proteins, genomic data mining, machine learning.
José L. Ramírez, Ana M. González, José R. Dorronsoro, Francisco J. Azuaje, José F. da Silveira, "Machine Learning Techniques for the Automated Classification of Adhesin-Like Proteins in the Human Protozoan Parasite Trypanosoma cruzi", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 4, pp. 695-702, October-December 2009, doi:10.1109/TCBB.2008.125