Issue No.05 - May (2001 vol.50)
pp: 519-525
ABSTRACT
<p><b>Abstract</b>—The known fast sequential algorithms for multiplying two <tmath>$N\times N$</tmath> matrices (over an arbitrary ring) have time complexity <tmath>$O(N^\alpha)$</tmath>, where <tmath>$2 < \alpha < 3$</tmath>. The current best value of <tmath>$\alpha$</tmath> is less than 2.3755. We show that, for all <tmath>$1 \le p \le N^{\alpha}$</tmath>, multiplying two <tmath>$N\times N$</tmath> matrices can be performed on a <it>p</it>-processor linear array with a reconfigurable pipelined bus system (LARPBS) in <tmath>$O({N^{\alpha}\over p}+({N^2\over p^{2/\alpha}})\log p)$</tmath> time. This is currently the fastest parallelization of the best known sequential matrix multiplication algorithm on a distributed memory parallel system. In particular, for all <tmath>$1 \le p \le N^{2.3755}$</tmath>, multiplying two <tmath>$N\times N$</tmath> matrices can be performed on a <it>p</it>-processor LARPBS in <tmath>$O({N^{2.3755}\over p}+({N^2\over p^{0.8419}})\log p)$</tmath> time and linear speedup can be achieved for <tmath>$p$</tmath> as large as <tmath>$O(N^{2.3755}/(\log N)^{6.3262})$</tmath>. Furthermore, multiplying two <tmath>$N\times N$</tmath> matrices can be performed on an LARPBS with <tmath>$O(N^\alpha)$</tmath> processors in <tmath>$O(\log N)$</tmath> time. This compares favorably with the performance on a PRAM.</p>
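The two numeric exponents in the abstract follow from the general bound by arithmetic. The communication term's exponent is 2/α. The linear-speedup limit comes from requiring the first term of the bound to dominate the second, which gives p = O(N^α / (log N)^(α/(α−2))). The following is a minimal sketch that checks these values for α = 2.3755. The function names are hypothetical, the logarithm base (2) is an assumption, and the hidden constants of the O-notation are ignored, so `time_bound` only illustrates how the bound grows and is not a running-time prediction.

```python
import math

ALPHA = 2.3755  # exponent quoted in the abstract


def bus_exponent(alpha: float) -> float:
    """Exponent of p in the communication term N^2 / p^(2/alpha)."""
    return 2.0 / alpha


def speedup_log_exponent(alpha: float) -> float:
    """Exponent of log N in the linear-speedup limit
    p = O(N^alpha / (log N)^(alpha / (alpha - 2)))."""
    return alpha / (alpha - 2.0)


def time_bound(n: float, p: float, alpha: float = ALPHA) -> float:
    """Value of N^alpha / p + (N^2 / p^(2/alpha)) * log p.

    Assumptions: base-2 logarithm, all hidden constants set to 1.
    """
    compute_term = n ** alpha / p
    communication_term = (n ** 2 / p ** (2.0 / alpha)) * math.log2(p)
    return compute_term + communication_term


if __name__ == "__main__":
    print(f"2/alpha           = {bus_exponent(ALPHA):.4f}")
    print(f"alpha/(alpha - 2) = {speedup_log_exponent(ALPHA):.4f}")
```

With p = 1 the communication term vanishes (log 1 = 0) and the bound reduces to the sequential cost N^α, which matches the lower end of the range 1 ≤ p ≤ N^α stated in the abstract.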
INDEX TERMS
Bilinear algorithm, cost-optimality, distributed memory system, linear array, matrix multiplication, optical pipelined bus, PRAM, reconfigurable system, speedup.
CITATION
Keqin Li, Victor Y. Pan, "Parallel Matrix Multiplication on a Linear Array with a Reconfigurable Pipelined Bus System", IEEE Transactions on Computers, vol.50, no. 5, pp. 519-525, May 2001, doi:10.1109/12.926164