Improving Performance of Dynamic Programming via Parallelism and Locality on Multicore Architectures
Issue No. 02 - February (2009 vol. 20)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2008.78
Guangming Tan, Institute of Computing Technology, Chinese Academy of Sciences, Beijing
Ninghui Sun, Institute of Computing Technology, Chinese Academy of Sciences, Beijing
Guang R. Gao, University of Delaware, Newark
Dynamic programming (DP) is a popular technique for solving combinatorial search and optimization problems. This paper focuses on one type of DP, nonserial polyadic dynamic programming (NPDP). Owing to the nonuniform data dependencies of NPDP, it is difficult to exploit either parallelism or locality. Worse still, emerging multicore and many-core architectures with small on-chip memory make these issues more challenging. In this paper, we address the challenges of exploiting the fine-grain parallelism and locality of NPDP on multicore architectures. We describe a latency-tolerant model and a percolation technique for programming on multicore architectures. At the algorithmic level, both parallelism and locality benefit from a specific data dependence transformation of NPDP. Next, we propose a parallel pipelining algorithm that decomposes computation operators and percolates data through the memory hierarchy to create just-in-time locality. To predict the execution time, we formulate an analytical performance model of the parallel algorithm. The parallel pipelining algorithm achieves not only high scalability on the 160-core IBM Cyclops64 but also portable performance across the 8-core Sun Niagara and the quad-core Intel Clovertown.
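To illustrate the kind of recurrence the abstract calls NPDP, the sketch below uses the textbook matrix-chain ordering problem as a stand-in. It is not code from the paper, and the function name `npdp_min_cost` and the `dims` argument are hypothetical. Each cell m[i][j] depends on a variable number of earlier cells spread across row i and column j, which is the nonuniform dependence pattern that makes parallelism and locality hard to exploit. The loop walks the table diagonal by diagonal. Cells on one diagonal do not depend on each other, so that loop is where a parallel version could split work.

```python
def npdp_min_cost(dims):
    """Minimum scalar multiplications to multiply a chain of matrices.

    Matrix t has shape dims[t] x dims[t + 1]. The recurrence is
    m[i][j] = min over i <= k < j of
              m[i][k] + m[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1].
    This is an illustrative NPDP instance, not the paper's algorithm.
    """
    n = len(dims) - 1  # number of matrices in the chain
    if n <= 0:
        return 0
    m = [[0] * n for _ in range(n)]
    for span in range(1, n):        # one diagonal of the triangular table
        for i in range(n - span):   # cells on this diagonal are independent
            j = i + span
            # Cell (i, j) reads span cells from row i and span cells from
            # column j, so the amount of work grows with the diagonal index.
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return m[0][n - 1]


if __name__ == "__main__":
    # Three matrices: 10x30, 30x5, 5x60.
    print(npdp_min_cost([10, 30, 5, 60]))
```

The reads along row i and column j touch memory with different strides, which hints at why the authors pair their parallel algorithm with explicit data movement through the memory hierarchy.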
Dynamic programming, memory hierarchy, latency tolerant, percolation, multicore.
G. Tan, N. Sun and G. R. Gao, "Improving Performance of Dynamic Programming via Parallelism and Locality on Multicore Architectures," in IEEE Transactions on Parallel & Distributed Systems, vol. 20, no. 2, pp. 261-274, 2009.