All-To-All Communication with Minimum Start-Up Costs in 2D/3D Tori and Meshes
May 1998 (vol. 9 no. 5)
pp. 442-458

Abstract: All-to-all communication patterns occur in many important parallel algorithms. This paper presents new algorithms for all-to-all communication patterns (all-to-all broadcast and all-to-all personalized exchange) for wormhole switched 2D/3D torus- and mesh-connected multiprocessors. The algorithms use message combining to minimize message start-ups at the expense of larger message sizes. The unique feature of these algorithms is that they are the first algorithms that we know of that operate in a bottom-up fashion rather than a recursive, top-down manner. For a $2^d \times 2^d$ torus or mesh, the algorithms for all-to-all personalized exchange have time complexity of $O(2^{3d})$. An important property of the algorithms is the $O(d)$ time due to message start-ups, compared with $O(2^d)$ for current algorithms [15], [18]. This is particularly important for modern parallel architectures where the start-up cost of message transmissions still dominates, except for very large block sizes. Finally, the 2D algorithms for all-to-all personalized exchange are extended to $O(2^{4d})$ algorithms in a $2^d \times 2^d \times 2^d$ 3D torus or mesh. These algorithms also retain the important property of $O(d)$ time due to message start-ups.
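As an illustration of the two patterns named in the abstract, the following minimal Python sketch shows only their end results: what each node holds after the exchange. It does not model the paper's bottom-up algorithms, wormhole switching, or message combining. The function names are hypothetical, and `startup_terms` assumes every hidden constant in the $O(d)$ and $O(2^d)$ start-up bounds equals 1, which the abstract does not state.

```python
def all_to_all_broadcast(blocks):
    """blocks[i] is the single block owned by node i.

    After the broadcast, every node holds a copy of every node's block.
    """
    return [list(blocks) for _ in blocks]


def personalized_exchange(send):
    """send[i][j] is the distinct block that node i holds for node j.

    After the exchange, result[j][i] is the block node j received from
    node i, i.e. the block matrix is transposed across the nodes.
    """
    n = len(send)
    return [[send[i][j] for i in range(n)] for j in range(n)]


def startup_terms(d, t_s=1.0):
    """Start-up terms for a 2^d x 2^d network.

    Assumption: constant factors are taken as 1, so this compares growth
    rates only. Returns (O(d) term, O(2^d) term) scaled by the per-message
    start-up cost t_s.
    """
    return d * t_s, (2 ** d) * t_s


if __name__ == "__main__":
    send = [[f"{i}->{j}" for j in range(4)] for i in range(4)]
    recv = personalized_exchange(send)
    print(recv[2])            # blocks addressed to node 2
    print(startup_terms(5))   # growth comparison for a 32 x 32 network
```

Under the unit-constant assumption, the gap between the two start-up terms widens exponentially with `d`, which is why the abstract stresses start-up count when start-up cost dominates transmission cost.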

[1] S.H. Bokhari, H. Berryman, "Complete Exchange on a Circuit Switched Mesh," Proc. Scalable High Performance Computing Conf., pp. 300-306, 1992.
[2] S.H. Bokhari, "Multiphase Complete Exchange on Paragon, SP2, and CS-2," IEEE Parallel and Distributed Technology, pp. 45-59, Fall 1996.
[3] J. Bruck, C.T. Ho, S. Kipnis, and D. Weathersby, "Efficient Algorithms for All-to-All Communications in Multiport Message-Passing Systems," Proc. Sixth Ann. ACM Symp. Parallel Algorithms and Architectures, pp. 298-309, June 1994.
[4] W.J. Dally, "Performance Analysis of k-ary n-Cube Interconnection Networks," IEEE Trans. Computers, vol. 39, no. 6, pp. 775-785, June 1990.
[5] S. Hinrichs, C. Kosak, D.R. O'Hallaron, T.M. Stricker, and R. Take, "An Architecture for Optimal All-to-All Personalized Communication," Proc. Symp. Parallel Algorithms and Architectures, pp. 310-319, 1994.
[6] S.L. Johnsson and C.T. Ho, "Spanning Graphs for Optimum Broadcasting and Personalized Communication in Hypercubes," IEEE Trans. Computers, vol. 38, no. 9, pp. 1,249-1,268, Sept. 1989.
[7] P.K. McKinley, Y.-J. Tsai, and D.F. Robinson, "Collective Communication Trees in Wormhole-Routed Massively Parallel Computers," Technical Report MSU-CPS-95-6, Michigan State Univ., Mar. 1995.
[8] P.K. McKinley, Y.-J. Tsai, and D. Robinson, "Collective Communication in Wormhole-routed Massively Parallel Computers," Computer, vol. 28, no. 12, pp. 39-50, Dec. 1995.
[9] L.M. Ni and P.K. McKinley, "A Survey of Wormhole Routing Techniques in Direct Networks," Computer, vol. 26, no. 2, pp. 62-76, Feb. 1993.
[10] D.K. Panda, "Issues in Designing Efficient and Practical Algorithms for Collective Communication on Wormhole-Routed Systems," Technical Report TR-25, Dept. of Computer and Information Science, Ohio State Univ., May 1995.
[11] D.S. Scott, "Efficient All-to-All Communication Patterns in Hypercube and Mesh Topologies," Proc. Sixth Conf. Distributed Memory Concurrent Computers, pp. 398-403, 1991.
[12] Y.J. Suh and S. Yalamanchili, "Algorithms for All-to-All Personalized Exchange in 2D and 3D Tori," Proc. 10th Int'l Parallel Processing Symp., pp. 808-814, Apr. 1996.
[13] Y.J. Suh and S. Yalamanchili, "Efficient Algorithms for Complete Exchange in Non-Power-of-Two 2D Tori," Proc. Ninth Int'l Conf. Parallel and Distributed Computing and Systems, pp. 113-119, 1997.
[14] Y.J. Suh, K.G. Shin, and S. Yalamanchili, "Complete Exchange in General Multidimensional Mesh Networks," Proc. 10th Int'l Conf. Parallel and Distributed Computing Systems, pp. 65-71, 1997.
[15] N.S. Sundar, D.N. Jayasimha, D.K. Panda, and P. Sadayappan, "Complete Exchange in 2D Meshes," Proc. Scalable High Performance Computing Conf., pp. 406-413, 1994.
[16] R. Thakur and A. Choudhary, "All-to-All Communication on Meshes with Wormhole Routing," Proc. Eighth Int'l Parallel Processing Symp., pp. 561-565, Apr. 1994.
[17] Y.-C. Tseng and S. Gupta, "All-to-All Personalized Communication in a Wormhole-Routed Torus," Proc. Int'l Conf. Parallel Processing, vol. 1, pp. 76-79, 1995.
[18] Y.-C. Tseng, S. Gupta, and D. Panda, "An Efficient Scheme for Complete Exchange in 2D Tori," Proc. Int'l Parallel Processing Symp., pp. 532-536, 1995.
[19] Message Passing Interface Forum, "MPI: A Message-Passing Interface Standard," Technical Report CS-93-214, Univ. of Tennessee, Apr. 1994.

Index Terms:
Interprocessor communication, parallel algorithms, collective communication, all-to-all communication, all-to-all broadcast, all-to-all personalized exchange, complete exchange.
Citation:
Young-Joo Suh, Sudhakar Yalamanchili, "All-To-All Communication with Minimum Start-Up Costs in 2D/3D Tori and Meshes," IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 5, pp. 442-458, May 1998, doi:10.1109/71.679215