Issue No. 10 - October (2001 vol. 50)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/12.956089
<p><b>Abstract</b>: All-to-all communication is one of the densest communication patterns and occurs in many important parallel computing applications. In this paper, we present a new all-to-all broadcast algorithm for all-port meshes and tori. The algorithm uses controlled message flooding based on a novel broadcast pattern, which balances the traffic load across all dimensions of the network so that the optimal transmission time for all-to-all broadcast can be achieved. The broadcast pattern is described for each node in a formal, generic way in terms of a few simple operations and can be easily built into router hardware. Unlike existing all-to-all broadcast algorithms, the new algorithm overlaps message switching time with transmission time in a pipelined fashion to reduce the total communication delay. In most cases, the total communication delay is within a small constant of the lower bound for all-to-all broadcast. Finally, the algorithm is conceptually simple and symmetrical for every message and every node, so it can be easily implemented in hardware and can achieve the optimum in practice.</p>
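The transmission-time lower bound that the abstract refers to can be illustrated with a simple counting argument: in all-to-all broadcast every node must receive one message from each of the other nodes, and an all-port node can receive at most one message per incoming link per step. The sketch below is not taken from the paper. The function name `min_receive_steps` is hypothetical, and the sketch assumes equal-length messages, one message per link per step, and at least three nodes along every dimension. It counts transmission steps only and ignores switching (startup) time.

```python
from math import prod


def min_receive_steps(dims, torus=True):
    """Counting lower bound on transmission steps for all-to-all broadcast.

    dims  -- number of nodes along each dimension, e.g. (8, 8) for an 8x8 network
    torus -- True for a torus (wraparound links), False for a mesh

    Assumption (not from the paper): each incoming link delivers at most
    one fixed-size message per step.
    """
    if any(k < 3 for k in dims):
        raise ValueError("this sketch assumes at least 3 nodes per dimension")

    # Every node has to receive a message from each of the other nodes.
    messages = prod(dims) - 1

    # A torus node has two incoming links per dimension. In a mesh, a corner
    # node has only one per dimension and is therefore the bottleneck.
    ports = (2 if torus else 1) * len(dims)

    # Ceiling division using integers only.
    return -(-messages // ports)


if __name__ == "__main__":
    print(min_receive_steps((4, 4), torus=True))    # 15 messages over 4 links
    print(min_receive_steps((4, 4), torus=False))   # 15 messages over 2 links
```

A balanced traffic load in all dimensions matters because this bound is reachable only if every incoming link of the bottleneck node is busy in nearly every step.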
Keywords: Parallel computing, collective communication, all-to-all communication, all-to-all broadcast, gossip, broadcast tree, routing, interprocessor communication.
Jianchao Wang, Yuanyuan Yang, "Pipelined All-to-All Broadcast in All-Port Meshes and Tori", IEEE Transactions on Computers, vol. 50, no. 10, pp. 1020-1032, October 2001, doi:10.1109/12.956089