Algebraic Foundations and Broadcasting Algorithms for Wormhole-Routed All-Port Tori
March 2000 (vol. 49 no. 3)
pp. 246-258

Abstract: One-to-all broadcast is the most fundamental collective communication pattern in a multicomputer network. We consider this problem in a wormhole-routed torus that uses the all-port, dimension-ordered routing model. We derive our routing algorithms from the concept of the “span of vector spaces” in linear algebra. For instance, in a 3D torus, the set of nodes receiving the broadcast message is “spanned” from the source node to a line of nodes, then to a plane of nodes, and finally to a cube of nodes. Our algorithms require at most $2(k-1)$ steps more than the optimal number of steps for any square $k$-D torus. By comparison, existing results apply only to tori of very restricted dimensions or sizes, and they either rely on undesirable non-dimension-ordered routing or require more steps.
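
The line, plane, cube progression described in the abstract can be illustrated with a minimal sketch. This is not the paper's routing algorithm: it only enumerates which nodes belong to each spanned set, assuming axis-aligned unit vectors as the spanning vectors and a square torus of side `n`. The names `span`, `basis`, and `stages` are hypothetical and chosen for illustration; the number of communication steps needed to reach each stage is not modeled.

```python
from itertools import product

def span(source, vectors, n):
    """Return all nodes source + sum(c_i * v_i) (mod n), with each c_i in [0, n).

    Illustrative helper only (an assumption, not taken from the paper).
    """
    k = len(source)
    nodes = set()
    # Enumerate every combination of integer coefficients, one per vector.
    for coeffs in product(range(n), repeat=len(vectors)):
        node = tuple(
            (source[d] + sum(c * v[d] for c, v in zip(coeffs, vectors))) % n
            for d in range(k)
        )
        nodes.add(node)
    return nodes

n = 4                      # side length of the square 3D torus (assumed)
source = (1, 2, 3)         # arbitrary source node
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]  # assumed spanning vectors

# Stage 0: the source alone; stage 1: a line; stage 2: a plane; stage 3: the cube.
stages = [span(source, basis[:i], n) for i in range(len(basis) + 1)]
sizes = [len(s) for s in stages]
print(sizes)
```

Each stage contains the previous one, so the informed set grows from a single node to n, n^2, and finally n^3 nodes as one more spanning vector is added.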

Index Terms:
Collective communication, interconnection network, one-to-all broadcast, parallel processing, torus, wormhole routing.
San-Yuan Wang, Yu-Chee Tseng, "Algebraic Foundations and Broadcasting Algorithms for Wormhole-Routed All-Port Tori," IEEE Transactions on Computers, vol. 49, no. 3, pp. 246-258, March 2000, doi:10.1109/12.841128