Parallel Genomic Alignments on the Cell Broadband Engine
November 2009 (vol. 20 no. 11)
pp. 1600-1610
Abhinav Sarje, Iowa State University, Ames
Srinivas Aluru, Iowa State University, Ames
Genomic alignments, as a means to uncover evolutionary relationships among organisms, are a fundamental tool in computational biology. There is considerable recent interest in using the Cell Broadband Engine, a heterogeneous multicore chip that provides high performance, for biological applications. However, work in genomic alignments so far has been limited to computing optimal alignment scores using quadratic space for the basic global/local alignment problem. In this paper, we present a comprehensive study of developing alignment algorithms on the Cell, exploiting its thread and data level parallelism features. First, we develop a parallel implementation on the Cell that computes optimal alignments and adopts Hirschberg's linear space technique. The former is essential, as merely computing optimal alignment scores is not useful, while the latter is needed to permit alignments of longer sequences. We then present Cell implementations of two advanced alignment techniques—spliced alignments and syntenic alignments. Spliced alignments are useful in aligning mRNA sequences with corresponding genomic sequences to uncover the gene structure. Syntenic alignments are used to discover conserved exons and other sequences between long genomic sequences from different organisms. We present experimental results for these three types of alignments on 16 Synergistic Processing Elements of the IBM QS20 dual-Cell blade system.
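The linear-space idea mentioned in the abstract can be illustrated with a minimal single-threaded sketch. This is not the authors' Cell implementation: it assumes a simple linear gap penalty and illustrative scoring values, and the function names (`nw_score_last_row`, `hirschberg`, `alignment_score`) are hypothetical. It shows how Hirschberg's divide-and-conquer recovers an optimal global alignment while keeping only one row of scores at a time.

```python
def nw_score_last_row(a, b, match=1, mismatch=-1, gap=-1):
    """Last row of the global-alignment score table, using O(len(b)) space."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev


def _align_single(ch, b, match, mismatch, gap):
    """Optimally align one character against a non-empty string b."""
    best_j = max(range(len(b)), key=lambda j: match if b[j] == ch else mismatch)
    sub = match if b[best_j] == ch else mismatch
    if sub >= 2 * gap:
        # Pairing ch with b[best_j] is at least as good as gapping both.
        return "-" * best_j + ch + "-" * (len(b) - best_j - 1), b
    return ch + "-" * len(b), "-" + b


def hirschberg(a, b, match=1, mismatch=-1, gap=-1):
    """Return an optimal global alignment (two gapped strings) of a and b."""
    if len(a) == 0:
        return "-" * len(b), b
    if len(b) == 0:
        return a, "-" * len(a)
    if len(a) == 1:
        return _align_single(a, b, match, mismatch, gap)

    mid = len(a) // 2
    n = len(b)
    # Scores of aligning the top half of a with every prefix of b ...
    left = nw_score_last_row(a[:mid], b, match, mismatch, gap)
    # ... and the bottom half of a with every suffix of b (via reversal).
    right = nw_score_last_row(a[mid:][::-1], b[::-1], match, mismatch, gap)
    split = max(range(n + 1), key=lambda j: left[j] + right[n - j])

    a1, b1 = hirschberg(a[:mid], b[:split], match, mismatch, gap)
    a2, b2 = hirschberg(a[mid:], b[split:], match, mismatch, gap)
    return a1 + a2, b1 + b2


def alignment_score(x, y, match=1, mismatch=-1, gap=-1):
    """Score a gapped alignment column by column."""
    total = 0
    for p, q in zip(x, y):
        if p == "-" or q == "-":
            total += gap
        elif p == q:
            total += match
        else:
            total += mismatch
    return total


if __name__ == "__main__":
    top, bottom = hirschberg("AGTACGCA", "TATGC")
    print(top)
    print(bottom)
    print(alignment_score(top, bottom))
```

The two half-problems produced at each split are independent, which is one reason the technique lends itself to parallel execution; how the work is actually distributed across Synergistic Processing Elements is described in the paper itself and is not modeled here.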

Index Terms:
Parallel algorithms, biology and genetics, pattern matching.
Abhinav Sarje, Srinivas Aluru, "Parallel Genomic Alignments on the Cell Broadband Engine," IEEE Transactions on Parallel and Distributed Systems, vol. 20, no. 11, pp. 1600-1610, Nov. 2009, doi:10.1109/TPDS.2008.254