An Extended Dominating Node Approach to Broadcast and Global Combine in Multiport Wormhole-Routed Mesh Networks
January 1997 (vol. 8 no. 1)
pp. 41-58

Abstract—A new approach to the design of collective communication operations in wormhole-routed mesh networks is described. The approach extends the concept of dominating sets in graph theory by accounting for the relative distance-insensitivity of the wormhole switching strategy and by taking advantage of a multiport communication architecture, which allows each node to simultaneously transmit messages on different outgoing channels. Collective communication operations are defined in terms of sets of extended dominating nodes (EDNs). The nodes in a set of EDNs can deliver (receive) messages to (from) a different, larger set of nodes in a single message-passing step under dimension-ordered wormhole routing and without channel contention among messages. The EDN model can be applied to different collective operations in 2D and 3D mesh networks. In this paper, we focus on EDN-based broadcast and global combine operations. Performance evaluation results are presented that confirm the advantage of this approach over other methods.
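Two ideas from the abstract can be illustrated with a small sketch: classical domination in a 2D mesh, and the requirement that all messages sent in one step avoid channel contention under dimension-ordered routing. The code below is not the paper's EDN construction. It is a minimal illustration that assumes X-then-Y routing, nodes named by `(x, y)` coordinates, and one directed channel per neighbor pair in each direction. The function names `xy_route`, `contention_free`, and `is_dominating_set` are hypothetical and chosen for this example only.

```python
def xy_route(src, dst):
    """Directed channels used by one message routed X first, then Y (assumed order)."""
    (x, y), (dx, dy) = src, dst
    channels = []
    while x != dx:  # travel along the X dimension first
        nx = x + (1 if dx > x else -1)
        channels.append(((x, y), (nx, y)))
        x = nx
    while y != dy:  # then travel along the Y dimension
        ny = y + (1 if dy > y else -1)
        channels.append(((x, y), (x, ny)))
        y = ny
    return channels


def contention_free(messages):
    """True if no directed channel is needed by two messages in the same step."""
    used = set()
    for src, dst in messages:
        for channel in xy_route(src, dst):
            if channel in used:
                return False
            used.add(channel)
    return True


def is_dominating_set(nodes, width, height):
    """Classical domination: every mesh node is in the set or adjacent to a member."""
    covered = set()
    for (x, y) in nodes:
        for cx, cy in [(x, y), (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]:
            if 0 <= cx < width and 0 <= cy < height:
                covered.add((cx, cy))
    return len(covered) == width * height
```

As a usage example, a node at `(1, 1)` sending to `(0, 1)`, `(2, 1)`, `(1, 0)`, and `(1, 2)` uses four distinct outgoing channels, so `contention_free` accepts that step, which is the situation a multiport architecture permits. Because wormhole switching is relatively insensitive to distance, the same check also accepts destinations that are not direct neighbors, provided their routes share no channel. This is the sense in which the paper's sets extend classical domination.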

Index Terms:
Collective communication, mesh networks, wormhole routing, multiport architecture, dominating set, broadcast, global combine.
Citation:
Yih-jia Tsai, Philip K. McKinley, "An Extended Dominating Node Approach to Broadcast and Global Combine in Multiport Wormhole-Routed Mesh Networks," IEEE Transactions on Parallel and Distributed Systems, vol. 8, no. 1, pp. 41-58, Jan. 1997, doi:10.1109/71.569654