10th Euromicro Workshop on Parallel, Distributed and Network-based Processing (EUROMICRO-PDP 2002)
Efficient Implementation of Reduce-Scatter in MPI
Canary Islands, Spain
January 9-11, 2002
ISBN: 0-7695-1444-8
We discuss the efficient implementation of the MPI collective operation reduce-scatter. We describe the implementation issues and the performance characterization of two reduce-scatter algorithms that have been proven highly efficient in theory under the assumption of a fully connected parallel system. A performance comparison with existing mainstream implementations of the operation confirms the practical advantage of the new algorithms. Experiments show that the two algorithms have different characteristics, which makes them complementary in providing a performance gain over standard algorithms.
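For readers unfamiliar with the operation, the following is a minimal sequential sketch of what reduce-scatter computes: an element-wise reduction of the vectors contributed by all ranks, followed by a scatter of consecutive segments of the result. It illustrates the semantics only and is not either of the two algorithms studied in the paper; the function name `reduce_scatter` and its parameters are hypothetical and do not come from any MPI binding.

```python
from functools import reduce
from operator import add


def reduce_scatter(send_buffers, recv_counts, op=add):
    """Simulate reduce-scatter semantics on one machine (illustrative only).

    send_buffers[r] is the vector contributed by rank r.
    recv_counts[r] is the number of reduced elements rank r receives.
    Returns a list whose entry r is the segment delivered to rank r.
    """
    if len(send_buffers) != len(recv_counts):
        raise ValueError("need one receive count per rank")
    total = sum(recv_counts)
    if any(len(buf) != total for buf in send_buffers):
        raise ValueError("every send buffer must hold sum(recv_counts) elements")

    # Reduction step: combine the i-th elements of all ranks with `op`.
    reduced = [reduce(op, column) for column in zip(*send_buffers)]

    # Scatter step: rank r gets the next recv_counts[r] elements.
    segments, offset = [], 0
    for count in recv_counts:
        segments.append(reduced[offset:offset + count])
        offset += count
    return segments


if __name__ == "__main__":
    buffers = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    print(reduce_scatter(buffers, [1, 1, 1]))
```

With three ranks contributing `[1, 2, 3]`, `[10, 20, 30]`, and `[100, 200, 300]` and one element per rank, rank 0 receives the sum of the first elements, rank 1 the sum of the second, and so on. A real implementation distributes both steps across processes, which is where the choice of algorithm affects performance.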
Index Terms:
MPI, Collective Communication, LogGP model
Citation:
Massimo Bernaschi, Giulio Iannello, Mario Lauria, "Efficient Implementation of Reduce-Scatter in MPI," pdp, pp.0301, 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing (EUROMICRO-PDP 2002), 2002