Issue No.04 - April (2011 vol.22)
pp: 529-543
Heshan Lin, Virginia Tech, Blacksburg
Xiaosong Ma, North Carolina State University and Oak Ridge National Laboratory, Raleigh
Wuchun Feng, Virginia Tech, Blacksburg
Nagiza F. Samatova, North Carolina State University and Oak Ridge National Laboratory, Raleigh
ABSTRACT
With the explosive growth of genomic information, the searching of sequence databases has emerged as one of the most computation- and data-intensive scientific applications. Our previous studies suggested that parallel genomic sequence search possesses highly irregular computation and I/O patterns. Effectively addressing these runtime irregularities is thus the key to designing scalable sequence-search tools on massively parallel computers. While the computation scheduling for irregular scientific applications and the optimization of noncontiguous file accesses have been well studied independently, little attention has been paid to the interplay between the two. In this paper, we systematically investigate the computation and I/O scheduling for data-intensive, irregular scientific applications within the context of genomic sequence search. Our study reveals that the lack of coordination between computation scheduling and I/O optimization could result in severe performance issues. We then propose an integrated scheduling approach that effectively improves sequence-search throughput by gracefully coordinating the dynamic load balancing of computation and high-performance noncontiguous I/O.
INDEX TERMS
Scheduling, parallel I/O, bioinformatics, parallel genomic sequence search, BLAST.
CITATION
Heshan Lin, Xiaosong Ma, Wuchun Feng, Nagiza F. Samatova, "Coordinating Computation and I/O in Massively Parallel Sequence Search", IEEE Transactions on Parallel & Distributed Systems, vol. 22, no. 4, pp. 529-543, April 2011, doi:10.1109/TPDS.2010.101