Issue No.09 - September (2010 vol.21)
pp: 1267-1280
Xiandong Meng, Texas A&M University, College Station
Vipin Chaudhary, University at Buffalo, The State University of New York, Buffalo
ABSTRACT
Advances in bioinformatics research continue to add complexity to the analysis and interpretation of biological data. Certain sequence database searches may take weeks to complete because of the complicated data dependencies inherent in dynamic programming. A reconfigurable coprocessor can remove this computational bottleneck and accelerate the operation. This paper presents a heterogeneous computing platform, built on a Message Passing Interface (MPI)-enabled enterprise computing infrastructure, for high-throughput biological sequence analysis. The platform integrates heterogeneous computer architectures, including conventional processors with Streaming Single Instruction Multiple Data Extensions 2 (SSE2) instructions, reconfigurable coprocessors, and legacy processors, into one system and allows each to perform the task to which it is best suited. With appropriate computation and communication scheduling, the integrated infrastructure is designed to accommodate various types of accelerators and to provide a High-Performance Computing (HPC) framework that supports the most widely used life science applications.
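To make the data dependency mentioned in the abstract concrete, here is a minimal sketch of the Smith-Waterman local alignment score, the algorithm named in this paper's index terms. It is not the authors' implementation: the function name, the scoring values (match=2, mismatch=-1, gap=-1), and the use of a linear gap penalty are all assumptions chosen for illustration. The point is that each cell depends on its upper, left, and upper-left neighbors, which is why the computation cannot be split trivially across cells.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of two sequences.

    Illustrative sketch only: scoring values and the linear gap
    penalty are assumptions, not taken from the paper.
    """
    # Only the previous row is kept, so memory is O(len(target)).
    prev = [0] * (len(target) + 1)
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            # Each cell needs three already-computed neighbors:
            # diagonal (prev[j-1]), upper (prev[j]), and left (curr[j-1]).
            curr[j] = max(
                0,                    # local alignment may restart anywhere
                prev[j - 1] + sub,    # align query[i-1] with target[j-1]
                prev[j] + gap,        # gap in the target
                curr[j - 1] + gap,    # gap in the query
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT"))  # 8
```

Cells on the same anti-diagonal do not depend on one another, which is the property that SIMD and FPGA implementations of this kind of recurrence typically exploit.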
INDEX TERMS
Heterogeneous computing platform, Smith-Waterman algorithm, dynamic programming, sequence alignment, SIMD, SSE2, FPGA, MPI, HPC.
CITATION
Xiandong Meng, Vipin Chaudhary, "A High-Performance Heterogeneous Computing Platform for Biological Sequence Analysis," IEEE Transactions on Parallel & Distributed Systems, vol. 21, no. 9, pp. 1267-1280, September 2010, doi:10.1109/TPDS.2009.165