EasyPDP: An Efficient Parallel Dynamic Programming Runtime System for Computational Biology
May 2012 (vol. 23 no. 5)
pp. 862-872
Shanjiang Tang, Tianjin University, Tianjin
Ce Yu, Tianjin University, Tianjin
Jizhou Sun, Tianjin University, Tianjin
Bu-Sung Lee, Nanyang Technological University, Singapore
Tao Zhang, Tianjin University, Tianjin
Zhen Xu, Tianjin University, Tianjin
Huabei Wu, Tianjin University, Tianjin
Dynamic programming (DP) is a popular and efficient technique in many scientific applications such as computational biology. Nevertheless, its performance is limited by the burgeoning volume of scientific data, and parallelism is crucial to keep the computation time at acceptable levels. The intrinsically strong data dependency of dynamic programming makes it difficult and error-prone for the programmer to write a correct and efficient parallel program. Therefore, this paper builds a runtime system named EasyPDP that aims to parallelize dynamic programming algorithms on multicore and multiprocessor platforms. To promote software reuse and reduce the complexity of parallel programming, a DAG Data Driven Model is proposed, which supports applications with strong data interdependence. Based on this model, the EasyPDP runtime system is designed and implemented. It automatically handles thread creation, dynamic data task allocation and scheduling, data partitioning, and fault tolerance. Five frequently used DAG patterns from biological dynamic programming algorithms have been put into the DAG pattern library of EasyPDP, so that programmers can choose whichever pattern suits their specific application. In addition, an ideal computing distribution model is proposed to discuss the optimal values for the performance tuning arguments of EasyPDP. We evaluate the performance potential and the fault tolerance feature of EasyPDP on a multicore system. We also compare EasyPDP with other methods such as Block-Cycle Wavefront (BCW). The experimental results show that EasyPDP performs well and provides an efficient infrastructure for dynamic programming algorithms.
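
The idea of DAG data-driven scheduling can be illustrated with a minimal Python sketch. It is not EasyPDP's API or implementation: the function name `lcs_length_dag`, the `block` and `workers` parameters, and the choice of longest common subsequence as the DP problem are all assumptions made for illustration. The DP table is partitioned into blocks, each block is a DAG node that depends on its top and left neighbours, and a block is dispatched to a worker thread only once both predecessors have finished.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def lcs_length_dag(a, b, block=4, workers=4):
    """Length of the longest common subsequence of a and b (hypothetical helper).

    Blocks of the DP table are scheduled in data-driven order: a block
    becomes ready when its top and left neighbour blocks are complete.
    """
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    rows = (n + block - 1) // block
    cols = (m + block - 1) // block
    if rows == 0 or cols == 0:
        return 0

    # Count of unfinished predecessors (top and left) for every block.
    pending = [[(i > 0) + (j > 0) for j in range(cols)] for i in range(rows)]
    remaining = [rows * cols]
    lock = threading.Lock()
    done = threading.Event()

    def compute(bi, bj):
        # Fill every cell of block (bi, bj) with the standard LCS recurrence.
        for i in range(bi * block + 1, min((bi + 1) * block, n) + 1):
            for j in range(bj * block + 1, min((bj + 1) * block, m) + 1):
                if a[i - 1] == b[j - 1]:
                    table[i][j] = table[i - 1][j - 1] + 1
                else:
                    table[i][j] = max(table[i - 1][j], table[i][j - 1])
        # Notify successors (bottom and right); collect those now ready.
        ready = []
        with lock:
            for si, sj in ((bi + 1, bj), (bi, bj + 1)):
                if si < rows and sj < cols:
                    pending[si][sj] -= 1
                    if pending[si][sj] == 0:
                        ready.append((si, sj))
            remaining[0] -= 1
            if remaining[0] == 0:
                done.set()
        for si, sj in ready:
            pool.submit(compute, si, sj)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pool.submit(compute, 0, 0)  # the only block with no predecessors
        done.wait()
    return table[n][m]


if __name__ == "__main__":
    print(lcs_length_dag("AGGTAB", "GXTXAYB"))
```

The diagonal dependency needs no separate counter, because the top-left block always finishes before the top and left blocks it feeds. In standard CPython the global interpreter lock prevents these threads from running the cell updates in parallel, so the sketch shows only the scheduling order, not a speedup.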

Index Terms:
Dynamic programming, EasyPDP, DAG data driven model, fault tolerance, DAG pattern, multicore, block cycle.
Shanjiang Tang, Ce Yu, Jizhou Sun, Bu-Sung Lee, Tao Zhang, Zhen Xu, Huabei Wu, "EasyPDP: An Efficient Parallel Dynamic Programming Runtime System for Computational Biology," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 5, pp. 862-872, May 2012, doi:10.1109/TPDS.2011.218