High Performance Computing and Grid in Asia Pacific Region, International Conference on (1997)
Seoul, Korea
Apr. 28, 1997 to May 2, 1997
ISBN: 0-8186-7901-8
pp: 224
Jaeyoung Choi, Soongsil University
ABSTRACT
We present DIMMA (Distribution-Independent Matrix Multiplication Algorithm), a new parallel matrix multiplication algorithm for distributed-memory concurrent computers. It is fast and scalable, and its performance is independent of how the data is distributed across processors. The algorithm is based on two new ideas: it uses a modified pipelined communication scheme to overlap computation and communication effectively, and it exploits the LCM block concept to obtain the maximum performance of the sequential BLAS routine on each processor, whether the block size is very small or very large. The algorithm is implemented and compared with SUMMA on the Intel Paragon computer.
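The abstract does not define the LCM block concept, so the following is only a rough sketch of how it is commonly understood. It assumes a P x Q process grid with a block-cyclic layout, in which the k-th column block of A sits in process column k mod Q and the k-th row block of B sits in process row k mod P. Under that assumption, blocks whose indices differ by lcm(P, Q) have the same owners and could be combined into one larger local multiplication. The function names and the grid size are hypothetical and are not taken from the paper.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def owners(k, P, Q):
    """Owners of panel k under an assumed block-cyclic layout.

    Returns (process row holding row block k of B,
             process column holding column block k of A).
    """
    return (k % P, k % Q)


def group_by_lcm(num_blocks, P, Q):
    """Group panel indices that share the same owners.

    Indices that are congruent modulo lcm(P, Q) land in the same group,
    so each group could be handled as one aggregated local product.
    """
    period = lcm(P, Q)
    groups = {}
    for k in range(num_blocks):
        groups.setdefault(k % period, []).append(k)
    return groups


# Hypothetical example: 12 panels on a 2 x 3 process grid.
groups = group_by_lcm(12, 2, 3)
for residue, panel_ids in sorted(groups.items()):
    print(residue, panel_ids, owners(panel_ids[0], 2, 3))
```

Each printed group lists panels that share one owner pair, which is why aggregating them would let the local BLAS call operate on a larger panel even when the distribution block size is small.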
CITATION
Jaeyoung Choi, "A New Parallel Matrix Multiplication Algorithm on Distributed-Memory Concurrent Computers", High Performance Computing and Grid in Asia Pacific Region, International Conference on, vol. 00, no. , pp. 224, 1997, doi:10.1109/HPC.1997.592151