Multidestination Message Passing in Wormhole k-ary n-cube Networks with Base Routing Conformed Paths
January 1999 (vol. 10 no. 1)
pp. 76-96

Abstract: This paper proposes multidestination message passing on wormhole k-ary n-cube networks using a new base-routing-conformed-path (BRCP) model. This model allows both unicast (single-destination) and multidestination messages to coexist in a given network without leading to deadlock. The model is illustrated with several common routing schemes (deterministic as well as adaptive), and the associated deadlock-freedom properties are analyzed. Using this model, a set of new algorithms for two popular collective communication operations, broadcast and multicast, is proposed and evaluated. The proposed algorithms are shown to considerably reduce the latency of these operations compared to the Umesh (unicast-based multicast) [1] and the Hamiltonian path-based [2] schemes. A notable result is that a multicast can be implemented with reduced or near-constant latency as the number of processors participating in the multicast increases beyond a certain number. It is also shown that the BRCP model can take advantage of adaptivity in routing schemes to further reduce the latency of these operations. The multidestination mechanism and the BRCP model establish a new foundation for fast and scalable collective communication support on wormhole-routed systems.

[1] P.K. McKinley et al., "Unicast-Based Multicast Communication in Wormhole-Routed Networks," IEEE Trans. Parallel and Distributed Systems, vol. 5, no. 12, pp. 1252-1265, Dec. 1994.
[2] X. Lin and L. Ni, "Deadlock-Free Multicast Wormhole Routing in Multicomputer Networks," Proc. Int'l Symp. Computer Architecture, June 1991.
[3] W.J. Dally and C.L. Seitz, “Deadlock-Free Message Routing in Multiprocessor Interconnection Networks,” IEEE Trans. Computers, vol. C-36, no. 5, pp. 547-553, May 1987.
[4] L.M. Ni and P.K. McKinley, "A Survey of Wormhole Routing Techniques in Direct Networks," Computer, vol. 26, no. 2, pp. 62-76, Feb. 1993.
[5] MPI: A Message-Passing Interface Standard, Message Passing Interface Forum, Mar. 1994.
[6] D.K. Panda, “Issues in Designing Efficient and Practical Algorithms for Collective Communication on Wormhole-Routed Systems,” ICPP'95 Workshop Challenges for Parallel Processing, pp. 8-15, 1995.
[7] M. Barnett, S. Gupta, D. Payne, L. Shuler, R. van de Geijn, and J. Watts, “Interprocessor Collective Communication Library (InterCom),” Proc. Scalable High Performance Computing Conf., pp. 357-364, May 1994.
[8] A.A. Chien and J.H. Kim, "Planar-Adaptive Routing: Low-Cost Adaptive Networks for Multiprocessors," Proc. 19th Int'l Symp. Computer Architecture, vol. 20, no. 2, pp. 268-277, May 1992.
[9] J. Duato, "A New Theory of Deadlock-Free Adaptive Routing in Wormhole Networks," IEEE Trans. Parallel and Distributed Systems, vol. 4, no. 12, pp. 1,320-1,331, Dec. 1993.
[10] C.J. Glass and L.M. Ni, "The Turn Model for Adaptive Routing," Proc. 19th Int'l Symp. Computer Architecture, vol. 20, no. 2, pp. 278-287, May 1992.
[11] D.K. Panda, "Global Reduction in Wormhole k-Ary n-Cube Networks with Multidestination Exchange Worms," Proc. Int'l Parallel Processing Symp., pp. 652-659, Apr. 1995.
[12] C.M. Chiang and L.M. Ni, "Multi-Address Encoding for Multicast," Proc. Parallel Computer Routing and Comm. Workshop, pp. 146-160, May 1994.
[13] D.K. Panda, "Fast Barrier Synchronization in Wormhole k-Ary n-Cube Networks with Multidestination Worms," Proc. Int'l Symp. High Performance Computer Architecture, pp. 200-209, 1995.
[14] J. Duato, S. Yalamanchili, and L.M. Ni, Interconnection Networks: An Engineering Approach. Los Alamitos, Calif.: IEEE CS Press, 1997.
[15] R.V. Boppana, S. Chalasani, and C.S. Raghavendra, “On Multicast Wormhole Routing in Multicomputer Networks,” Proc. Symp. Parallel and Distributed Processing, pp. 722-729, 1994.
[16] S. Balakrishnan and D.K. Panda, “Impact of Multiple Consumption Channels on Wormhole Routed k-ary n-Cube Networks,” Proc. Int'l Parallel Processing Symp., pp. 163-167, 1993.
[17] J. Bruck, R. Cypher, and C.T. Ho, “Multiple Message Broadcasting with Generalized Fibonacci Trees,” Fourth Symp. Parallel and Distributed Processing, IEEE, pp. 424-431, Dec. 1992.
[18] S.L. Johnsson and C.T. Ho, “Spanning Graphs for Optimum Broadcasting and Personalized Communication in Hypercubes,” IEEE Trans. Computers, vol. 38, no. 9, pp. 1,249-1,268, Sept. 1989.
[19] C.-T. Ho and M.-Y. Kao, "Optimal Broadcast on Hypercubes with Wormhole and E-Cube Routing," Proc. Int'l Conf. Parallel and Distributed Systems, pp. 694-697, 1993.
[20] E. Fleury and P. Fraigniaud, "Multicasting in Meshes," Proc. Int'l Conf. Parallel Processing, pp. 151-158, 1994.
[21] R. Kesavan, K. Bondalapati, and D.K. Panda, “Multicast on Irregular Switch-Based Networks with Wormhole Routing,” Proc. Int'l Symp. High Performance Computer Architecture (HPCA-3), pp. 48-57, Feb. 1997.
[22] D.K. Panda, D. Basak, D. Dai, R. Kesavan, R. Sivaram, M. Banikazemi, and V. Moorthy, "Simulation of Modern Parallel Systems: A CSIM-Based Approach," Proc. 1997 Winter Simulation Conf. (WSC '97), pp. 1,013-1,020, Dec. 1997.
[23] R. Kesavan and D.K. Panda, “Minimizing Node Contention in Multiple Multicast on Wormhole k-Ary n-Cube Networks,” Proc. Int'l Conf. Parallel Processing, vol. I, pp. 188-195, Aug. 1996.
[24] X. Lin, P.K. McKinley, and L.M. Ni, "Performance Evaluation of Multicast Wormhole Routing in 2D-Mesh Multicomputers," Proc. Int'l Conf. Parallel Processing, pp. I:435-442, 1991.
[25] D. Dai and D.K. Panda, “Reducing Cache Invalidation Overheads in Wormhole DSMs Using Multidestination Message Passing,” Proc. Int'l Conf. Parallel Processing, pp. I:138–145, Chicago, Ill., Aug. 1996.
[26] C.M. Chiang and L.M. Ni, "Deadlock-Free Multi-Head Wormhole Routing," Proc. First High Performance Computing-Asia, 1995.
[27] R. Sivaram, D.K. Panda, and C.B. Stunkel, “Efficient Broadcast and Multicast on Multistage Interconnection Networks Using Multiport Encoding,” IEEE Trans. Parallel and Distributed Systems, vol. 9, no. 10, pp. 1004-1028, Oct. 1998.
[28] C.B. Stunkel, R. Sivaram, and D.K. Panda, “Implementing MultiDestination Worms in Switch-Based Parallel Systems: Architectural Alternatives and Their Impact,” Proc. 24th IEEE/ACM Ann. Int'l Symp. Computer Architecture (ISCA-24), pp. 50-61, June 1997.
[29] R. Kesavan and D.K. Panda, "Multicasting on Switch-based Irregular Networks Using Multi-Drop Path-Based Multidestination Worms," Proc. Second Workshop Parallel Computer Routing and Comm. (PCRCW '97), pp. 217-230, June 1997.
[30] R. Sivaram, R. Kesavan, D.K. Panda, and C.B. Stunkel, “Where to Provide Support for Efficient Multicasting in Irregular Networks: Network Interface or Switch?” Proc. 27th Int'l Conf. Parallel Processing (ICPP '98), pp. 452-459, Aug. 1998.
[31] R. Sivaram, D.K. Panda, and C.B. Stunkel, "Multicasting in Irregular Networks with Cut-Through Switches using Tree-Based Multidestination Worms," Proc. Second Parallel Computer Routing and Comm. Workshop (PCRCW '97), pp. 39-52, June 1997.

Index Terms:
Wormhole routing, collective communication, broadcast, multicast, k-ary n-cubes, meshes, interconnection networks, deadlock-freedom, and interprocessor communication.
Citation:
Dhabaleswar K. Panda, Sanjay Singal, Ram Kesavan, "Multidestination Message Passing in Wormhole k-ary n-cube Networks with Base Routing Conformed Paths," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 1, pp. 76-96, Jan. 1999, doi:10.1109/71.744844