Issue No.04 - April (2003 vol.14)
Heejo Lee, IEEE
Jong Kim, IEEE
Sung Je Hong, IEEE
Sunggu Lee, IEEE
<p><b>Abstract.</b> The problem of finding an optimal product sequence for sequential multiplication of a chain of matrices (the matrix chain ordering problem, MCOP) is well known and has long been studied. In this paper, we consider the problem of finding an optimal product schedule for evaluating a chain of matrix products on a parallel computer (the matrix chain scheduling problem, MCSP). The two problems differ in their target: the MCOP concerns a product sequence for single-processor systems, whereas the MCSP concerns a sequence of concurrent matrix products for parallel systems. Parallelizing each matrix product after finding an optimal product sequence for a single-processor system does not guarantee the minimum evaluation time on a parallel system, since each parallelized matrix product may use processors inefficiently. We introduce a new processor scheduling algorithm for the MCSP that reduces the evaluation time of a chain of matrix products on a parallel computer, even at the expense of a slight increase in the total number of operations. Given a chain of <em>n</em> matrices and a matrix product utilizing at most <em>P</em>/<em>k</em> processors in a <em>P</em>-processor system, the proposed algorithm approaches <em>k</em>(<em>n</em> - 1) / (<em>n</em> + <em>k</em> log(<em>k</em>) - <em>k</em>) times the performance of parallel evaluation using the optimal sequence found for the MCOP. Experiments performed on a Fujitsu AP1000 multicomputer also show that the proposed algorithm significantly decreases the time required to evaluate a chain of matrix products on parallel systems.</p>
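For readers unfamiliar with the MCOP baseline that the abstract compares against, the following is a minimal sketch of the classical dynamic-programming solution for a single processor. It is not the scheduling algorithm proposed in the paper. The function name `matrix_chain_order`, the `dims` parameter, and the labels `A1`, `A2`, ... are illustrative assumptions, and the cost model assumes that multiplying a p x q matrix by a q x r matrix takes p*q*r scalar multiplications.

```python
from functools import lru_cache


def matrix_chain_order(dims):
    """Return (min_scalar_multiplications, parenthesization) for a chain.

    Hypothetical helper for illustration: matrix i (0-based) is assumed to
    have shape dims[i] x dims[i + 1], so a chain of n matrices needs
    len(dims) == n + 1.
    """
    n = len(dims) - 1
    if n < 1:
        raise ValueError("dims must describe at least one matrix")

    @lru_cache(maxsize=None)
    def best(i, j):
        # Cheapest way to compute the product of matrices i..j (inclusive).
        if i == j:
            return 0, f"A{i + 1}"
        candidates = []
        for k in range(i, j):
            left_cost, left_expr = best(i, k)
            right_cost, right_expr = best(k + 1, j)
            # Cost of the final product: (dims[i] x dims[k+1]) times
            # (dims[k+1] x dims[j+1]).
            cost = left_cost + right_cost + dims[i] * dims[k + 1] * dims[j + 1]
            candidates.append((cost, f"({left_expr} {right_expr})"))
        # On ties, the earliest split point wins.
        return min(candidates, key=lambda c: c[0])

    return best(0, n - 1)


if __name__ == "__main__":
    # Three matrices of shapes 10x30, 30x5, and 5x60.
    print(matrix_chain_order([10, 30, 5, 60]))
```

The sketch minimizes only the total operation count. As the abstract notes, the sequence it returns may not minimize evaluation time once each product is parallelized, which is the gap the MCSP addresses.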
Matrix chain product, parallel matrix multiplication, matrix chain scheduling problem, processor allocation, task scheduling.
Heejo Lee, Jong Kim, Sung Je Hong, Sunggu Lee, "Processor Allocation and Task Scheduling of Matrix Chain Products on Parallel Systems", IEEE Transactions on Parallel & Distributed Systems, vol. 14, no. 4, pp. 394-407, April 2003, doi:10.1109/TPDS.2003.1195411