Optimal Multicast Communication in Wormhole-Routed Torus Networks
October 1995 (vol. 6 no. 10)
pp. 1029-1042

Abstract—This paper presents efficient algorithms that implement one-to-many, or multicast, communication in wormhole-routed torus networks. By exploiting the properties of the switching technology and the use of virtual channels, a minimum-time multicast algorithm is presented for n-dimensional torus networks that use deterministic, dimension-ordered routing of unicast messages. The algorithm can deliver a multicast message to $m - 1$ destinations in $\lceil \log_2 m \rceil$ message-passing steps, while avoiding contention among the constituent unicast messages. Performance results of a simulation study on torus networks with up to 4096 nodes are also given.
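
The $\lceil \log_2 m \rceil$ step count can be illustrated with a generic recursive-halving schedule, in which every node that already holds the message forwards it to one new node per step. The sketch below is not the paper's torus algorithm: it ignores the topology, dimension-ordered routing, and virtual channels, so it says nothing about contention avoidance. The function name `multicast_schedule` and the list-based node representation are assumptions made for illustration only.

```python
def multicast_schedule(nodes):
    """Build a unicast-based multicast schedule by recursive halving.

    nodes[0] is the source and the remaining entries are destinations.
    Returns a list of steps; each step is a list of (sender, receiver)
    pairs that are sent during the same message-passing step.
    Illustrative only: node ordering and contention are not modeled.
    """
    steps = []
    # Each (lo, hi) index range is a sublist whose first node holds the message.
    ranges = [(0, len(nodes))]
    while any(hi - lo > 1 for lo, hi in ranges):
        step = []
        next_ranges = []
        for lo, hi in ranges:
            if hi - lo <= 1:
                # Single-node sublist: nothing left to deliver here.
                next_ranges.append((lo, hi))
                continue
            # Split so the lower half is never smaller than the upper half.
            mid = lo + (hi - lo + 1) // 2
            # The holder sends to the first node of the upper half,
            # which becomes the holder for that half.
            step.append((nodes[lo], nodes[mid]))
            next_ranges.append((lo, mid))
            next_ranges.append((mid, hi))
        steps.append(step)
        ranges = next_ranges
    return steps


if __name__ == "__main__":
    # Hypothetical example: source "A" and four destinations (m = 5).
    for number, step in enumerate(multicast_schedule(list("ABCDE")), start=1):
        print(number, step)
```

Because the largest sublist shrinks from size $s$ to $\lceil s/2 \rceil$ in each step, the schedule finishes in $\lceil \log_2 m \rceil$ steps for $m$ nodes. This matches the bound quoted in the abstract but not the contention-free property, which depends on how the nodes are ordered and routed in the torus.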


Index Terms:
Multicast, wormhole routing, torus topology, virtual channels, routing algorithms, dimension-ordered routing.
David F. Robinson, Philip K. McKinley, Betty H.C. Cheng, "Optimal Multicast Communication in Wormhole-Routed Torus Networks," IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 10, pp. 1029-1042, Oct. 1995, doi:10.1109/71.473513