Exploiting Global Knowledge to Achieve Self-Tuned Congestion Control for k-Ary n-Cube Networks
March 2004 (vol. 15 no. 3)
pp. 257-272

Abstract: Network performance in tightly coupled multiprocessors typically degrades rapidly beyond network saturation. Consequently, designers must keep a network below its saturation point by reducing the load on the network. Congestion control via source throttling, a common technique to reduce the network load, prevents new packets from entering the network in the presence of congestion. Unfortunately, prior schemes to implement source throttling either lack vital global information about the network to make the correct decision (whether to throttle or not) or depend on specific network parameters or communication patterns. This paper presents a global-knowledge-based, self-tuned congestion control technique that prevents saturation at high loads across different communication patterns for k-ary n-cube networks. Our design is composed of two key components. First, we use global information about a network to obtain a timely estimate of network congestion. We compare this estimate to a threshold value to determine when to throttle packet injection. The second component is a self-tuning mechanism that automatically determines appropriate threshold values based on throughput feedback. A combination of these two techniques provides high performance under heavy load, does not penalize performance under light load, and gracefully adapts to changes in communication patterns.
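
The two components described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the congestion estimate is computed or which tuning rule is used, so everything here is an assumption made for illustration: the class name `SelfTunedThrottle`, the fixed `step`, the threshold bounds, and the use of simple hill climbing on throughput are all hypothetical and are not claimed to be the authors' design.

```python
from dataclasses import dataclass


@dataclass
class SelfTunedThrottle:
    """Hypothetical controller; names, step size, and tuning rule are assumptions."""

    threshold: float               # congestion level above which injection is blocked
    step: float = 1.0              # assumed tuning increment
    min_threshold: float = 1.0     # assumed lower bound
    max_threshold: float = 1024.0  # assumed upper bound
    _direction: int = 1            # +1 raises the threshold, -1 lowers it
    _last_throughput: float = 0.0

    def should_throttle(self, congestion_estimate: float) -> bool:
        """Component 1: compare the global congestion estimate to the threshold."""
        return congestion_estimate > self.threshold

    def tune(self, throughput: float) -> None:
        """Component 2: adjust the threshold from throughput feedback.

        Assumed rule (hill climbing): keep moving the threshold in the same
        direction while throughput does not drop; reverse when it drops.
        """
        if throughput < self._last_throughput:
            self._direction = -self._direction
        self.threshold += self._direction * self.step
        # Keep the threshold inside the assumed bounds.
        self.threshold = max(self.min_threshold,
                             min(self.max_threshold, self.threshold))
        self._last_throughput = throughput


if __name__ == "__main__":
    ctl = SelfTunedThrottle(threshold=10.0)
    print(ctl.should_throttle(12.0))  # estimate exceeds threshold, so throttle
    for observed in (5.0, 6.0, 4.0):  # made-up throughput samples
        ctl.tune(observed)
    print(ctl.threshold)
```

In this sketch a node would call `should_throttle` before injecting each packet and call `tune` once per measurement interval; how often to tune and how the global estimate reaches each node are left open because the abstract does not specify them.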

Index Terms:
Interconnection networks, wormhole, k-ary n-cubes, congestion control, global information, self-tuning.
Mithuna Thottethodi, Alvin R. Lebeck, Shubhendu S. Mukherjee, "Exploiting Global Knowledge to Achieve Self-Tuned Congestion Control for k-Ary n-Cube Networks," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 3, pp. 257-272, March 2004, doi:10.1109/TPDS.2004.1264810