Yuanyuan Yang, Jianchao Wang, "Near-Optimal All-to-All Broadcast in Multidimensional All-Port Meshes and Tori," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 2, pp. 128-141, February 2002.
DOI: http://doi.ieeecomputersociety.org/10.1109/71.983941. ISSN: 1045-9219. Publisher: IEEE Computer Society, Los Alamitos, CA, USA.
Index Terms: Parallel computing, collective communication, all-to-all communication, all-to-all broadcast, gossip, broadcast tree, routing, interprocessor communication, mesh, torus.
Abstract: All-to-all communication is one of the densest collective communication patterns and occurs in many important applications in parallel and distributed computing. In this paper, we present a new all-to-all broadcast algorithm for multidimensional all-port mesh and torus networks. We propose a broadcast pattern that balances the traffic load across all dimensions of the network, so that the algorithm achieves a tight, near-optimal transmission time. The algorithm also overlaps message switching time with transmission time, and its total communication delay asymptotically matches the lower bound for all-to-all broadcast. Finally, the algorithm is conceptually simple and symmetrical for every message and every node, so it can be implemented easily in hardware and can achieve near-optimal performance in practice.
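To make the notion of a transmission-time lower bound concrete, the sketch below applies a standard counting argument; it is not taken from the paper and may differ from the bound the authors use. It assumes unit-size messages, one message per link per step, and no message combining. Under those assumptions, a node with `d` links needs at least `ceil((N - 1) / d)` steps to receive the other `N - 1` messages. The function names and the `wraparound` flag are hypothetical, chosen only for illustration.

```python
from math import ceil, prod


def min_degree(dims, wraparound):
    """Smallest number of distinct neighbors of any node.

    dims       -- side length of each dimension, e.g. (4, 4)
    wraparound -- True for a torus, False for a mesh
    """
    degree = 0
    for size in dims:
        if size == 1:
            continue  # a dimension of length 1 contributes no links
        if wraparound and size >= 3:
            degree += 2  # torus: two distinct neighbors per dimension
        else:
            degree += 1  # mesh corner, or a torus dimension of length 2
    return degree


def all_to_all_lower_bound(dims, wraparound=True):
    """Counting lower bound on transmission steps (illustrative assumption).

    The least-connected node must receive N - 1 messages over its links,
    at most one message per link per step.
    """
    n_nodes = prod(dims)
    if n_nodes == 1:
        return 0  # nothing to exchange
    return ceil((n_nodes - 1) / min_degree(dims, wraparound))


if __name__ == "__main__":
    for dims in [(4, 4), (8, 8), (3, 3, 3)]:
        print(dims,
              "torus:", all_to_all_lower_bound(dims, True),
              "mesh:", all_to_all_lower_bound(dims, False))
```

In a mesh the corner nodes have the fewest links, so they set the bound; in a torus every node has the same degree. This is why balancing traffic across all dimensions matters for approaching the bound.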

