Issue No.01 - January-February (2011 vol.8)
pp: 226-233
Dukka B. KC, University of North Carolina at Charlotte, Charlotte
Dennis R. Livesay, University of North Carolina at Charlotte, Charlotte
Prediction of protein functional sites from sequence-derived data remains an open bioinformatics problem. We have developed a phylogenetic motif (PM) functional site prediction approach that identifies functional sites from alignment fragments that parallel the evolutionary patterns of the family. In our approach, PMs are identified by comparing the tree topology of each alignment fragment to that of the complete phylogeny. Herein, we bypass the phylogenetic reconstruction step and identify PMs directly from distance matrix comparisons. To optimize the new algorithm, we consider three different distance matrices and 13 different matrix similarity scores. We assess the performance of the various approaches on a structurally nonredundant data set that includes three types of functional site definitions. Without exception, the original approach outperforms the distance matrix variants in predictive power. Although the distance matrix methods fail to improve upon the original approach, our results are important because they clearly demonstrate that the improved predictive power is based on the topological comparisons, meaning that phylogenetic trees are a straightforward yet powerful way to improve functional site prediction accuracy. While complementary studies have shown that topology improves predictions of protein-protein interactions, this report represents the first demonstration that trees improve functional site predictions as well.
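The distance matrix variant described in the abstract can be illustrated with a minimal sketch. The abstract does not name the three distance matrices or the 13 similarity scores, so the choices here are assumptions made only for illustration: a simple p-distance (fraction of mismatched columns) as the distance, Pearson correlation as the matrix similarity score, and a fixed-width sliding window to define alignment fragments. The function names (`p_distance`, `distance_vector`, `pearson`, `window_scores`) are hypothetical and are not taken from the authors' software.

```python
from itertools import combinations
from math import sqrt


def p_distance(a, b):
    """Fraction of mismatched columns between two aligned sequences.

    Columns where both sequences have a gap ('-') are ignored.
    This is an assumed, illustrative distance, not the paper's.
    """
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_vector(seqs):
    """Flattened upper triangle of the pairwise distance matrix."""
    return [p_distance(a, b) for a, b in combinations(seqs, 2)]


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((x - mean_u) * (y - mean_v) for x, y in zip(u, v))
    sd_u = sqrt(sum((x - mean_u) ** 2 for x in u))
    sd_v = sqrt(sum((y - mean_v) ** 2 for y in v))
    if sd_u == 0 or sd_v == 0:
        return 0.0
    return cov / (sd_u * sd_v)


def window_scores(alignment, width):
    """Score each sliding-window fragment by how closely its distance
    matrix tracks the distance matrix of the complete alignment."""
    full = distance_vector(alignment)
    length = len(alignment[0])
    scores = []
    for start in range(length - width + 1):
        fragment = [seq[start:start + width] for seq in alignment]
        scores.append(pearson(distance_vector(fragment), full))
    return scores


# Toy alignment (made-up sequences, for illustration only).
toy_alignment = [
    "ACDEFGHIKL",
    "ACDEFGHIKV",
    "ACNEFGHLKV",
    "TCNEYGHLKV",
]
toy_scores = window_scores(toy_alignment, 3)
```

Under these assumptions, windows whose scores are highest are the fragments whose pairwise distances best parallel those of the whole family, which is the role PMs play in the abstract. Note that this sketch covers only the distance matrix shortcut; the original approach compares tree topologies, which the abstract reports to be the more predictive comparison.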
Phylogenetic motif, functional site prediction, phylogenetic tree, distance matrix.
Dukka B. KC, Dennis R. Livesay, "Topology Improves Phylogenetic Motif Functional Site Predictions", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 1, pp. 226-233, January-February 2011, doi:10.1109/TCBB.2009.60