Fifth IEEE International Conference on Cluster Computing (CLUSTER'03)
Efficient Parallel Out-of-Core Matrix Transposition
Hong Kong
December 01-December 04
ISBN: 0-7695-2066-9
This paper addresses the problem of parallel transposition of large out-of-core arrays. Although algorithms for out-of-core matrix transposition have been widely studied, previously proposed algorithms have sought to minimize the number of I/O operations and the in-memory permutation time. We propose an algorithm that directly targets the improvement of overall transposition time. The I/O characteristics of the system are used to determine the read, write and communication block sizes such that the total execution time is minimized. We also provide a solution to the array redistribution problem for arrays on disk. The solutions to the sequential transposition problem and the parallel array redistribution problem are then combined to obtain an algorithm for the parallel out-of-core transposition problem.
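To make the setting concrete, here is a minimal sequential sketch of a tiled out-of-core transposition. It is not the algorithm proposed in the paper: it uses a single hypothetical parameter, tile, in place of the separate read, write and communication block sizes, and it does not derive that value from measured I/O characteristics. The function names, the row-major float64 file layout and the file names are all assumptions made for illustration.

```python
import array
import os
import tempfile

ITEM = 8  # bytes per float64 element (assumed on-disk element type)


def write_matrix(path, values):
    """Store a flat list of floats as raw row-major float64 data."""
    with open(path, "wb") as f:
        f.write(array.array("d", values).tobytes())


def read_matrix(path):
    """Load a raw float64 file back into a flat Python list."""
    data = array.array("d")
    with open(path, "rb") as f:
        data.frombytes(f.read())
    return data.tolist()


def transpose_out_of_core(src_path, dst_path, rows, cols, tile):
    """Transpose a rows x cols row-major matrix from src_path into dst_path.

    At most one tile x tile block is held in memory at a time. The tile
    size is the tunable knob; here it is simply passed in by the caller.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.truncate(rows * cols * ITEM)  # preallocate the output file
        for r0 in range(0, rows, tile):
            r1 = min(r0 + tile, rows)
            for c0 in range(0, cols, tile):
                c1 = min(c0 + tile, cols)
                # Read the block one contiguous row segment at a time.
                block = []
                for r in range(r0, r1):
                    src.seek((r * cols + c0) * ITEM)
                    seg = array.array("d")
                    seg.frombytes(src.read((c1 - c0) * ITEM))
                    block.append(seg)
                # Each column of the block becomes a contiguous row
                # segment of the transposed (cols x rows) output.
                for c in range(c0, c1):
                    out = array.array(
                        "d", (block[r - r0][c - c0] for r in range(r0, r1))
                    )
                    dst.seek((c * rows + r0) * ITEM)
                    dst.write(out.tobytes())


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "a.bin")
    dst = os.path.join(workdir, "a_t.bin")
    write_matrix(src, [float(i) for i in range(5 * 7)])
    transpose_out_of_core(src, dst, rows=5, cols=7, tile=3)
    print(read_matrix(dst)[:5])
```

A larger tile means fewer, longer contiguous reads and writes at the cost of more memory. That trade-off is the kind of choice the abstract says is driven by the system's I/O characteristics, although the sketch leaves the choice to the caller.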
Citation:
Sriram Krishnamoorthy, Gerald Baumgartner, Daniel Cociorva, Chi-Chung Lam, P. Sadayappan, "Efficient Parallel Out-of-Core Matrix Transposition," Fifth IEEE International Conference on Cluster Computing (CLUSTER'03), pp. 300, 2003