On Complexity of Protein Structure Alignment Problem under Distance Constraint
March/April 2012 (vol. 9 no. 2)
pp. 511-516
Aleksandar Poleksic, University of Northern Iowa, Cedar Falls
We study the well-known Largest Common Point-set (LCP) under Bottleneck Distance problem. Given two proteins a and b (represented as sequences of points in three-dimensional space) and a distance cutoff \sigma, the goal is to find a spatial superposition and an alignment that maximize the number of pairs of points from a and b that can be placed within distance \sigma of each other. The best algorithms to date for the approximate and exact solutions to this problem run in time O(n^8) and O(n^{32}), respectively, where n is the protein length. This work improves the runtime of the approximation algorithm and the expected runtime of the algorithm for the absolute optimum, for both order-dependent and order-independent alignments. More specifically, our algorithms for near-optimal and optimal sequential alignments run in time O(n^7 \log n) and O(n^{14} \log n), respectively. For nonsequential alignments, the corresponding running times are O(n^{7.5}) and O(n^{14.5}).
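To make the objective concrete, the following minimal sketch scores a sequential (order-dependent) alignment for one fixed superposition, that is, assuming both structures have already been placed in a common coordinate frame. It is not the algorithm of this paper: it covers only the inner counting step with a standard longest-common-subsequence style dynamic program, and it does not search over superpositions, which is the hard part of the problem. The function name `lcp_sequential_fixed` and the toy coordinates are hypothetical and chosen for illustration only.

```python
import math


def lcp_sequential_fixed(a, b, sigma):
    """Count the largest set of order-preserving pairs (a[i], b[j]) whose
    Euclidean distance is at most sigma, for structures that are ALREADY
    superposed. Illustrative sketch only; not the paper's algorithm.
    """
    n, m = len(a), len(b)
    # dp[i][j] = best pair count using the first i points of a
    # and the first j points of b.
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Option 1: leave a[i-1] or b[j-1] unaligned.
            best = max(dp[i - 1][j], dp[i][j - 1])
            # Option 2: pair a[i-1] with b[j-1] if they fit under sigma.
            if math.dist(a[i - 1], b[j - 1]) <= sigma:
                best = max(best, dp[i - 1][j - 1] + 1)
            dp[i][j] = best
    return dp[n][m]


# Toy data (hypothetical): three points per "protein".
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.5), (5.0, 5.0, 5.0), (2.0, 0.0, 0.5)]
score = lcp_sequential_fixed(a, b, sigma=1.0)
```

For a fixed superposition this step takes O(n^2) time. The nonsequential variant drops the ordering constraint, so the same count becomes a maximum matching between points lying within \sigma of each other rather than a dynamic program.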

Index Terms:
Protein structure, structural alignment, structural similarity, alignment algorithms.
Aleksandar Poleksic, "On Complexity of Protein Structure Alignment Problem under Distance Constraint," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 2, pp. 511-516, March-April 2012, doi:10.1109/TCBB.2011.133