This Article 
 Bibliographic References 
 Add to: 
Space and Time Optimal Parallel Sequence Alignments
December 2004 (vol. 15 no. 12)
pp. 1070-1081

Abstract—We present the first space and time optimal parallel algorithm for the pairwise sequence alignment problem, a fundamental problem in computational biology. This problem can be solved sequentially in O(mn) time and O(m+n) space, where m and n are the lengths of the sequences to be aligned. The fastest known parallel space-optimal algorithm for pairwise sequence alignment takes optimal O\left({{m+n}\over{p}}\right) space, but suboptimal O\left({{(m+n)^2}\over{p}}\right) time, where p is the number of processors. On the other hand, the most space economical time-optimal parallel algorithm takes O\left({{mn}\over{p}}\right) time, but O\left(m+{{n}\over{p}}\right) space. We close this gap by presenting an algorithm that achieves both time and space optimality, i.e. requires only O\left({{m+n}\over{p}}\right) space and O\left({{mn}\over{p}}\right) time. We also present an experimental evaluation of the proposed algorithm on an IBM xSeries cluster. Although presented in the context of full sequence alignments, our algorithm is applicable to other alignment problems in computational biology including local alignments and syntenic alignments. It is also a useful addition to the range of techniques available for parallel dynamic programming.

[1] S. Aluru, N. Futamura, and K. Mehrotra, “Parallel Biological Sequence Comparison Using Prefix Computations,” J. Parallel and Distributed Computing, vol. 63, no. 3, pp. 264-272, 2003.
[2] C.E.R. Alves, E.N. Caceres, and F. Dehne, “Parallel Dynamic Programming for Solving the String Editing Problem on a CGM/BSP,” Proc. ACM Symp. Parallel Algorithms and Architectures, pp. 275-281, 2002.
[3] A. Apostolico, M.J. Atallah, L.L. Larmore, and S. MacFaddin, “Efficient Parallel Algorithms for String Editing and Related Problems,” SIAM J. Computing, vol. 19, no. 5, pp. 968-988, 1990.
[4] M.O. Dayhoff, R. Schwartz, and B.C. Orcutt, “A Model of Evolutionary Change in Proteins: Matrices for Detecting Distant Relationships,” Atlas of Protein Sequence and Structure, vol. 5, M.O. Dayhoff, ed., Nat'l Biomedical Research Foundation, DC, pp. 345-358, 1978.
[5] E.W. Edmiston and R.A. Wagner, “Parallelization of the Dynamic Programming Algorithm for Comparison of Sequences,” Proc. Int'l Conf. Parallel Processing, pp. 78-80, 1987.
[6] E.W. Edmiston, N.G. Core, J.H. Saltz, and R.M. Smith, “Parallel Processing of Biological Sequence Comparison Algorithms,” Int'l J. Parallel Programming, vol. 17, no. 3, pp. 259-275, 1988.
[7] N. Futamura, S. Aluru, and X. Huang, “Parallel Syntenic Alignments,” Proc. Int'l Conf. High Performance Computing, pp. 420-430, 2002.
[8] O. Gotoh, “An Improved Algorithm for Matching Biological Sequences,” J. Molecular Biology, vol. 162, pp. 705-708, 1982.
[9] S. Henikoff and J.G. Henikoff, “Amino Acid Substitution Matrices from Protein Blocks,” Proc. Nat'l Academy of Sciences, vol. 89, pp. 10915-10919, 1992.
[10] D.S. Hirschberg, “A Linear Space Algorithm for Computing Maximal Common Subsequences,” Comm. ACM, vol. 18, no. 6, pp. 341-343, 1975.
[11] A. Grama, V. Kumar, and A. Gupta, Introduction to Parallel Computing, second ed., Reading, Mass.: Addison-Wesley Publishing, 2003.
[12] X. Huang, “A Space-Efficient Parallel Sequence Comparison Algorithm for a Message-Passing Multiprocessor,” Int'l J. Parallel Programming, vol. 18, no. 3, pp. 223-239, 1989.
[13] X. Huang, “A Space-Efficient Algorithm for Local Similarities,” Computer Applications in the Biosciences, vol. 6, no. 4, pp. 373-381, 1990.
[14] X. Huang and K. Chao, “A Generalized Global Alignment Algorithm,” Bioinformatics, vol. 19, no. 2, pp. 228-233, 2003.
[15] E. Lander, J.P. Mesirov, and W. Taylor, “Protein Sequence Comparison on a Data Parallel Computer,” Proc. Int'l Conf. Parallel Processing, pp. 257-263, 1980.
[16] W.J. Masek and M.S. Paterson, “A Faster Algorithm for Computing String Edit Distances,” J. Computer and System Sciences, vol. 20, pp. 18-31, 1980.
[17] E.W. Mayers and W. Miller, “Optimal Alignments in Linear Space,” Computer Applications in the Biosciences, vol. 4, no. 1, pp. 11-17, 1988.
[18] S.B. Needleman and C.D. Wunsch, “A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins,” J. Molecular Biology, vol. 48, pp. 443-453, 1970.
[19] P. Pacheco, Parallel Programming with MPI. San Francisco, Calif.: Morgan Kaufmann Publishers, 1996.
[20] S. Ranka and S. Sahni, “String Editing on an SIMD Hypercube Multicomputer,” J. Parallel and Distributed Computing, vol. 9, pp. 411-418, 1990.
[21] J. Setubal and J. Meidanis, Introduction to Computational Molecular Biology. Boston, Mass.: PWS Publishing Company, 1997.
[22] T.F. Smith and M.S. Waterman, “Identification of Common Molecular Subsequences,” J. Molecular Biology, vol. 147, pp. 195-197, 1981.

Index Terms:
Computational biology, sequence alignments, space-efficient, parallel algorithms, parallel dynamic programming.
Stjepan Rajko, Srinivas Aluru, "Space and Time Optimal Parallel Sequence Alignments," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 12, pp. 1070-1081, Dec. 2004, doi:10.1109/TPDS.2004.86
Usage of this product signifies your acceptance of the Terms of Use.