Issue No.03 - May/June (2011 vol.8)
pp: 819-831
Jayendra Gnanaskandan Venkateswaran, University of Florida, Gainesville
Bin Song, University of Florida, Gainesville
Tamer Kahveci, University of Florida, Gainesville
Christopher Jermaine, University of Florida, Gainesville
ABSTRACT
Finding structural similarities in distantly related proteins can reveal functional relationships that cannot be identified using sequence comparison. Given two proteins A and B and a threshold of ε Å, we develop an algorithm, TRiplet-based Iterative ALignment (TRIAL), for computing the transformation of B that maximizes the number of aligned residues such that the root mean square deviation (RMSD) of the alignment is at most ε Å. Our algorithm is designed with the specific goal of effectively handling proteins with low similarity in primary structure, where existing algorithms perform particularly poorly. Experiments show that our method outperforms existing methods. TRIAL alignment brings the secondary structures of distantly related proteins to similar orientations. It also finds a larger number of secondary structure matches at lower RMSD values and produces longer overall alignments. Its classification accuracy is up to 63 percent better than that of other methods, including CE and DALI. TRIAL successfully aligns 83 percent of the residues from the smaller protein in reasonable time, while other methods align only 29 to 65 percent of the residues for the same set of proteins.
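To make the constraint in the abstract concrete, the following minimal Python sketch checks whether a candidate rigid transformation of B keeps the RMSD of a given set of aligned residue pairs within the threshold epsilon. It is not the TRIAL algorithm, which searches for the transformation and the alignment; it only evaluates the feasibility condition. The function names, the representation of each residue as a single (x, y, z) point, and the list of index pairs are assumptions made for illustration.

```python
import math


def apply_transform(points, rotation, translation):
    """Apply a rigid transform (3x3 rotation, then translation) to 3D points."""
    return [
        tuple(
            sum(rotation[i][k] * p[k] for k in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]


def rmsd(coords_a, coords_b):
    """Root mean square deviation over paired coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def is_feasible(a, b, pairs, rotation, translation, epsilon):
    """Return True if the aligned pairs have RMSD <= epsilon after moving B.

    `pairs` is a hypothetical list of (index in A, index in B) tuples.
    """
    moved_b = apply_transform(b, rotation, translation)
    aligned_a = [a[i] for i, _ in pairs]
    aligned_b = [moved_b[j] for _, j in pairs]
    return rmsd(aligned_a, aligned_b) <= epsilon


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    protein_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    # Toy B: the same points shifted by one unit along z.
    protein_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
    pairs = [(0, 0), (1, 1), (2, 2)]
    print(is_feasible(protein_a, protein_b, pairs, identity, (0.0, 0.0, -1.0), 0.5))
    print(is_feasible(protein_a, protein_b, pairs, identity, (0.0, 0.0, 0.0), 0.5))
```

In this toy case, translating B by one unit along z superimposes it on A exactly, so the first check passes, while leaving B in place gives an RMSD of 1 and fails a threshold of 0.5. Under the objective stated in the abstract, a method would prefer, among all feasible transformations, the one that admits the largest set of aligned pairs.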
INDEX TERMS
Protein structure, tertiary structure, alignment.
CITATION
Jayendra Gnanaskandan Venkateswaran, Bin Song, Tamer Kahveci, Christopher Jermaine, "TRIAL: A Tool for Finding Distant Structural Similarities", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.8, no. 3, pp. 819-831, May/June 2011, doi:10.1109/TCBB.2009.28