
Heshan Lin, Xiaosong Ma, Wu-chun Feng, Nagiza F. Samatova, "Coordinating Computation and I/O in Massively Parallel Sequence Search," IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 4, pp. 529-543, April 2011.
@article{ 10.1109/TPDS.2010.101,
  author = {Heshan Lin and Xiaosong Ma and Wu-chun Feng and Nagiza F. Samatova},
  title = {Coordinating Computation and I/O in Massively Parallel Sequence Search},
  journal = {IEEE Transactions on Parallel and Distributed Systems},
  volume = {22},
  number = {4},
  issn = {1045-9219},
  year = {2011},
  pages = {529-543},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPDS.2010.101},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
TY  - JOUR
JO  - IEEE Transactions on Parallel and Distributed Systems
TI  - Coordinating Computation and I/O in Massively Parallel Sequence Search
IS  - 4
SN  - 1045-9219
SP  - 529
EP  - 543
EPD  - 529-543
A1  - Heshan Lin
A1  - Xiaosong Ma
A1  - Wu-chun Feng
A1  - Nagiza F. Samatova
PY  - 2011
KW  - Scheduling
KW  - parallel I/O
KW  - bioinformatics
KW  - parallel genomic sequence search
KW  - BLAST
VL  - 22
JA  - IEEE Transactions on Parallel and Distributed Systems
ER  -
[1] D. Benson, I. Karsch-Mizrachi, D. Lipman, J. Ostell, and D. Wheeler, "GenBank," Nucleic Acids Research, vol. 30, no. 1, pp. 17-20, Jan. 2008.
[2] J. Ostell, "Databases of Discovery," ACM Queue, vol. 3, no. 3, pp. 40-48, 2005.
[3] Nat'l Research Council, The New Science of Metagenomics: Revealing the Secrets of Our Microbial Planet. Nat'l Academy of Sciences, 2007.
[4] S. Schwartz, J. Kent, A. Smit, Z. Zhang, R. Baertsch, R. Hardison, D. Haussler, and W. Miller, "Human-Mouse Alignments with BLASTZ," Genome Res., vol. 13, pp. 103-107, 2003.
[5] M. Gardner, W. Feng, J. Archuleta, H. Lin, and X. Ma, "Parallel Genomic Sequence-Searching on an Ad-Hoc Grid: Experiences, Lessons Learned, and Implications," Proc. ACM/IEEE SC2006 Conf. High Performance Networking and Computing, 2006.
[6] A. Ching, W. Feng, H. Lin, X. Ma, and A. Choudhary, "Exploring I/O Strategies for Parallel Sequence Database Search Tools with S3aSim," Proc. Int'l Symp. High Performance Distributed Computing, June 2006.
[7] S. Altschul, W. Gish, W. Miller, E. Myers, and D. Lipman, "Basic Local Alignment Search Tool," J. Molecular Biology, vol. 215, no. 3, pp. 403-410, 1990.
[8] S. Altschul, T. Madden, A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. Lipman, "Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs," Nucleic Acids Research, vol. 25, no. 17, pp. 3389-3402, 1997.
[9] M. Warren and J. Salmon, "A Parallel Hashed Oct-Tree N-Body Algorithm," Proc. ACM/IEEE Conf. Supercomputing, 1993.
[10] J. Chen and V. Taylor, "Mesh Partitioning for Distributed Systems: Exploring Optimal Number of Partitions with Local and Remote Communication," Proc. SIAM Conf. Parallel Processing for Scientific Computing (PPSC), 1999.
[11] K. Schloegel, G. Karypis, and V. Kumar, "Dynamic Repartitioning of Adaptively Refined Meshes," Proc. ACM/IEEE Conf. Supercomputing, 1998.
[12] A. Sohn and H. Simon, "SHARP: A Scalable Parallel Dynamic Partitioner for Adaptive Mesh-Based Computations," Proc. Supercomputing (SC '98), citeseer.ist.psu.edu/articlesohn98sharp.html , 1998.
[13] S. Hummel, E. Schonberg, and L. Flynn, "Factoring: A Method for Scheduling Parallel Loops," Comm. ACM, vol. 35, no. 8, pp. 90-101, 1992.
[14] S. Hummel, J. Schmidt, R. Uma, and J. Wein, "Load-Sharing in Heterogeneous Systems via Weighted Factoring," Proc. Eighth Ann. ACM Symp. Parallel Algorithms and Architectures (SPAA), 1996.
[15] I. Banicescu and S. Hummel, "Balancing Processor Loads and Exploiting Data Locality in N-Body Simulations," Proc. ACM/IEEE Conf. Supercomputing, 1995.
[16] I. Banicescu and V. Velusamy, "Load Balancing Highly Irregular Computations with the Adaptive Factoring," Proc. 16th IEEE CS Int'l Parallel and Distributed Processing Symp. (IPDPS '02), p. 195, 2002.
[17] I. Banicescu, V. Velusamy, and J. Devaprasad, "On the Scalability of Dynamic Scheduling Scientific Applications with Adaptive Weighted Factoring," Cluster Computing, vol. 6, no. 3, pp. 215-226, 2003.
[18] R. Thakur and A. Choudhary, "An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays," Scientific Programming, vol. 5, no. 4, pp. 301-317, 1996.
[19] R. Thakur, W. Gropp, and E. Lusk, "On Implementing MPI-IO Portably and with High Performance," Proc. Sixth Workshop I/O in Parallel and Distributed Systems, May 1999.
[20] R. Thakur, W. Gropp, and E. Lusk, "Optimizing Noncontiguous Accesses in MPI-IO," Parallel Computing, vol. 28, no. 1, pp. 83-105, Jan. 2002.
[21] A. Ching, A. Choudhary, W. Keng Liao, R. Ross, and W. Gropp, "Noncontiguous I/O through PVFS," Proc. IEEE CS Int'l Conf. Cluster Computing (CLUSTER '02), 2002.
[22] A. Ching, A. Choudhary, K. Coloma, W. Keng Liao, R. Ross, and W. Gropp, "Noncontiguous I/O Accesses through MPI-IO," Proc. Third IEEE CS Int'l Symp. Cluster Computing and the Grid (CCGRID), 2003.
[23] F. Isaila and W. Tichy, "View I/O: Improving the Performance of Non-Contiguous I/O," Proc. IEEE Int'l Conf. Cluster Computing, Dec. 2003.
[24] A. Darling, L. Carey, and W. Feng, "The Design, Implementation, and Evaluation of mpiBLAST," Proc. ClusterWorld Conf. and Expo, in conjunction with the Fourth Int'l Conf. Linux Clusters: the HPC Revolution, 2003.
[25] T. Smith and M. Waterman, "Identification of Common Molecular Subsequences," J. Molecular Biology, vol. 147, pp. 195-197, 1981.
[26] S. Needleman and C. Wunsch, "A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins," J. Molecular Biology, vol. 48, no. 3, pp. 443-453, 1970.
[27] D. Lipman and W. Pearson, "Improved Tools for Biological Sequence Comparison," Proc. Nat'l Acad. Sci., vol. 85, no. 8, pp. 2444-2448, 1988.
[28] R. Luthy and C. Hoover, "Hardware and Software Systems for Accelerating Common Bioinformatics Sequence Analysis Algorithms," Biosilico, vol. 2, no. 1, pp. 12-17, 2004.
[29] C. White, R. Singh, P. Reintjes, J. Lampe, B. Erickson, W. Dettloff, V. Chi, and S. Altschul, "BioSCAN: A VLSI-Based System for Biosequence Analysis," Proc. IEEE CS Int'l Conf. Computer Design on VLSI in Computer and Processors (ICCD), 1991.
[30] "Bioccerator," http:/eta.emblheidelberg.de:8000/, Compugen Ltd., 1994.
[31] R. Braun, K. Pedretti, T. Casavant, T. Scheetz, C. Birkett, and C. Roberts, "Parallelization of Local Blast Service on Workstation Clusters," Future Generation Computer Systems, vol. 17, no. 6, pp. 745-754, 2001.
[32] N. Camp, H. Cofer, and R. Gomperts, "High-Throughput BLAST," http://www.sgi.com/industries/sciences/chembio/ resources/papers/HTBlastHT_Whitepaper.html , 2010.
[33] E. Chi, E. Shoop, J. Carlis, E. Retzel, and J. Riedl, "Efficiency of Shared-Memory Multiprocessors for a Genetic Sequence Similarity Search Algorithm," Technical Report TR97005, Univ. of Minnesota, Computer Science Dept., 1997.
[34] R. Bjornson, A. Sherman, S. Weston, N. Willard, and J. Wing, "TurboBLAST(r): A Parallel Implementation of BLAST Built on the TurboHub," Proc. Int'l Parallel and Distributed Processing Symp., 2002.
[35] D. Mathog, "Parallel BLAST on Split Databases," Bioinformatics, vol. 19, no. 14, pp. 1865-1866, 2003.
[36] H. Lin, X. Ma, P. Chandramohan, A. Geist, and N. Samatova, "Efficient Data Access for Parallel BLAST," Proc. 19th IEEE CS Int'l Parallel and Distributed Processing Symp. (IPDPS '05), 2005.
[37] H. Lin, P. Balaji, R. Poole, C. Sosa, X. Ma, and W. Feng, "Massively Parallel Genomic Sequence Search on the Blue Gene/P Architecture," Proc. Int'l Conf. High Performance Computing, Networking, Storage and Analysis (SC '08), 2008.
[38] H. Rangwala, E. Lantz, R. Musselman, K. Pinnow, B. Smith, and B. Wallenfelt, "Massively Parallel BLAST for the Blue Gene/L," Proc. High Availability and Performance Workshop, 2005.
[39] C. Oehmen and J. Nieplocha, "ScalaBLAST: A Scalable Implementation of BLAST for High-Performance Data-Intensive Bioinformatics Analysis," IEEE Trans. Parallel and Distributed Systems, vol. 17, no. 8, pp. 740-749, Aug. 2006.
[40] J. Nieplocha, R. Harrison, and R. Littlefield, "Global Arrays: A Nonuniform Memory Access Programming Model for High-Performance Computers," The J. Supercomputing, vol. 10, no. 2, pp. 169-189, 1996.
[41] O. Thorsen, K. Jiang, A. Peters, B. Smith, H. Lin, W. Feng, and C. Sosa, "Parallel Genomic Sequence-Search on a Massively Parallel System," Proc. ACM Int'l Conf. Computing Frontiers, 2007.
[42] J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," Comm. ACM, vol. 51, no. 1, pp. 107-113, 2008.
[43] C. Moretti, H. Bui, K. Hollingsworth, B. Rich, P. Flynn, and D. Thain, "All-Pairs: An Abstraction for Data-Intensive Computing on Campus Grids," IEEE Trans. Parallel and Distributed Systems, vol. 21, no. 1, pp. 33-46, Jan. 2010.
[44] A. Matsunaga, M. Tsugawa, and J. Fortes, "CloudBLAST: Combining MapReduce and Virtualization on Distributed Resources for Bioinformatics Applications," Proc. Fourth IEEE CS Int'l Conf. eScience (ESCIENCE '08), pp. 222-229, 2008.
[45] S. Ghemawat, H. Gobioff, and S.-T. Leung, "The Google File System," ACM SIGOPS Operating Systems Rev., vol. 37, no. 5, pp. 29-43, 2003.
[46] R. Thakur, A. Choudhary, R. Bordawekar, S. More, and S. Kuditipudi, "Passion: Optimized I/O for Parallel Applications," Computer, vol. 29, no. 6, pp. 70-78, June 1996.
[47] J. May, Parallel I/O for High Performance Computing. Morgan Kaufmann Publishers, 2001.
[48] J. Bent, G. Gibson, G. Grider, B. McClelland, P. Nowoczynski, J. Nunez, M. Polte, and M. Wingate, "PLFS: A Checkpoint Filesystem for Parallel Applications," Proc. Conf. High Performance Computing Networking, Storage and Analysis, pp. 1-12, 2009.
[49] MPI-2: Extensions to the Message-Passing Standard, Message Passing Interface Forum, July 1997.
[50] R. Thakur, W. Gropp, and E. Lusk, "Data Sieving and Collective I/O in ROMIO," Proc. Seventh Symp. Frontiers of Massively Parallel Computation, Feb. 1999.
[51] F. Schmuck and R. Haskin, "GPFS: A Shared-Disk File System for Large Computing Clusters," Proc. First Conf. File and Storage Technologies, 2002.
[52] "ZFS at OpenSolaris.org," http://www.opensolaris.org/os/ community zfs/, 2010.
[53] K. Jiang, O. Thorsen, A. Peters, B. Smith, and C.P. Sosa, "An Efficient Parallel Implementation of the Hidden Markov Methods for Genomic Sequence-Search on a Massively Parallel System," IEEE Trans. Parallel and Distributed Systems, vol. 19, no. 1, pp. 15-23, Jan. 2007.
[54] C. Wu and A. Kalyanaraman, "An Efficient Parallel Approach for Identifying Protein Families in Large-Scale Metagenomic Data Sets," Proc. ACM/IEEE Conf. Supercomputing, pp. 1-10, 2008.