Issue No.03 - May/June (2011 vol.8)
pp: 775-784
Vladimir Pavlovic, Rutgers University, Piscataway
Bud Mishra, New York University, New York
ABSTRACT
Accurate computational prediction of protein functions increasingly relies on network-inspired models for protein function transfer. This task becomes challenging for proteins that are isolated in their own network or that have poorly characterized or uncharacterized neighborhoods. Here, we present a novel probabilistic chain-graph-based approach for predicting protein functions that connects the networks of two (or more) different species through links of high interspecies sequence homology. In this way, proteins are able to "exchange" functional information with their homologous neighbors from a different species. Knowledge of interspecies relationships, such as sequence homology, can become crucial when information from other data sources, including protein-protein interactions or cellular locations of proteins, is limited. We further enhance our model to account for Gene Ontology dependencies by linking multiple but related functional ontology categories within and across species. The resulting networks are of significantly higher complexity than most traditional protein network models. We comprehensively benchmark our method by applying it to two of the largest protein networks, Yeast and Fly. The joint Fly-Yeast network provides substantial improvements in precision, accuracy, and false positive rate over networks that consider either source in isolation. At the same time, the new model retains computational efficiency similar to that of the simpler networks.
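The idea of joining two species' networks through homology links can be illustrated with a minimal Python sketch. It is not the chain-graph model described in the abstract: it only builds a joint adjacency map and applies a simple neighbor vote, to show how a protein that is isolated in its own network gains annotated neighbors once homology links are added. The function names, the species labels, the `min_score` threshold, and the toy protein identifiers are all assumptions made for illustration.

```python
from collections import defaultdict


def build_joint_network(yeast_ppi, fly_ppi, homology, min_score=0.5):
    """Build an undirected adjacency map over (species, protein) nodes.

    yeast_ppi, fly_ppi: iterables of (protein, protein) interaction pairs.
    homology: iterable of (yeast_protein, fly_protein, score) triples.
    min_score: hypothetical cutoff for "high" sequence homology.
    """
    adj = defaultdict(set)
    # Within-species edges: protein-protein interactions.
    for species, edges in (("yeast", yeast_ppi), ("fly", fly_ppi)):
        for u, v in edges:
            adj[(species, u)].add((species, v))
            adj[(species, v)].add((species, u))
    # Cross-species edges: keep only sufficiently strong homology links.
    for y, f, score in homology:
        if score >= min_score:
            adj[("yeast", y)].add(("fly", f))
            adj[("fly", f)].add(("yeast", y))
    return adj


def neighbor_vote(adj, labels, node):
    """Fraction of annotated neighbors that carry the function.

    labels maps annotated nodes to True/False for one functional category.
    Returns None when the node has no annotated neighbors.
    """
    votes = [labels[n] for n in adj.get(node, ()) if n in labels]
    return sum(votes) / len(votes) if votes else None


# Toy data: fly protein f1 has no interactions in its own network.
labels = {("yeast", "y1"): True}
alone = build_joint_network([("y1", "y2")], [], [])
joint = build_joint_network([("y1", "y2")], [], [("y1", "f1", 0.9)])

print(neighbor_vote(alone, labels, ("fly", "f1")))
print(neighbor_vote(joint, labels, ("fly", "f1")))
```

In the single-species case the isolated fly protein has no annotated neighbors to draw on; in the joint network it inherits evidence from its yeast homolog. A probabilistic model would replace the vote with inference over the joint graph.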
INDEX TERMS
Biology and genetics, machine learning, bioinformatics (genome or protein) databases.
CITATION
Vladimir Pavlovic, Bud Mishra, "Prediction of Protein Functions with Gene Ontology and Interspecies Protein Homology Data", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.8, no. 3, pp. 775-784, May/June 2011, doi:10.1109/TCBB.2010.15