<p><b>Abstract</b>—We consider the problem of efficiently performing a reduce-scatter operation in a message passing system. Reduce-scatter is the composition of an element-wise reduction on vectors of <it>n</it> elements initially held by <it>n</it> processors, with a scatter of the resulting vector among the processors. In this paper, we present two algorithms for the reduce-scatter operation, designed in LogGP. The first algorithm assumes an associative and commutative reduction operator and it is optimal in LogGP within a small constant factor. The second algorithm allows the reduction operator to be noncommutative, and it is asymptotically optimal when values to be combined are large arrays. To achieve these results, we developed a complete analysis of both algorithms in LogGP, including the derivation of lower bounds for the reduce-scatter operation, and the study of the <it>m</it>-item version of the problem, i.e., the case when the initial elements are vectors themselves. Reduce-scatter has been included as a collective operation in the MPI standard message passing library, and can be used, for instance, in parallel matrix-vector multiply when the matrix is decomposed by columns. To model a message passing system, we adopted the LogGP model, an extension of LogP that allows the modeling of messages of different length. While this choice makes the analysis somewhat more complex, it leads to more realistic results in the case of gather/scatter algorithms.</p>
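To make the definition above concrete, here is a minimal sequential sketch of what reduce-scatter computes. It illustrates the semantics only and is not either of the paper's LogGP algorithms. The function name `reduce_scatter` and the sample data are hypothetical, and the processors are simulated as a list of Python lists.

```python
from functools import reduce
import operator


def reduce_scatter(vectors, op=operator.add):
    """Sequential reference for reduce-scatter semantics (illustrative only).

    vectors[i] is the n-element vector initially held by processor i.
    Returns a list whose entry j is the value processor j ends up with:
    the reduction, in rank order, of element j across all n processors.
    """
    n = len(vectors)
    if any(len(v) != n for v in vectors):
        raise ValueError("each of the n processors must hold n elements")
    # Reducing in rank order 0..n-1 keeps the result well defined
    # even when op is associative but not commutative.
    return [reduce(op, (vectors[i][j] for i in range(n))) for j in range(n)]


# Three hypothetical processors, each holding a 3-element vector.
held = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
result = reduce_scatter(held)

# String concatenation is associative but not commutative,
# so the rank order of the operands matters here.
words = [["a", "b"], ["c", "d"]]
ordered = reduce_scatter(words, op=operator.add)
```

In this sketch, processor `j` receives `result[j]`. With a noncommutative operator such as string concatenation, the operands must be combined in rank order, which is the case the abstract's second algorithm is designed to handle.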
Keywords: Reduce-scatter, algorithm analysis, parallel algorithm, collective communication operations, LogP, LogGP, postal model, generalized Fibonacci numbers, MPI.

G. Iannello, "Efficient Algorithms for the Reduce-Scatter Operation in LogGP," in IEEE Transactions on Parallel & Distributed Systems, vol. 8, no. , pp. 970-982, 1997.