A Progressive Approach to Handling Message-Dependent Deadlock in Parallel Computer Systems
March 2003 (vol. 14 no. 3)
pp. 259-275

Abstract: Handling deadlocks is essential for providing reliable communication paths between processing nodes in parallel computer systems. The existence of multiple message types and their associated intermessage dependencies may cause message-dependent deadlocks even in networks that are designed to be free of routing deadlock. Most methods currently used for dealing with message-dependent deadlocks require more system resources than are necessary, do not use system resources efficiently, or both. This may have an adverse effect on system performance if resources are scarce. In this paper, we characterize the frequency of message-dependent deadlocks in multiprocessor/multicomputer systems. We also propose a handling technique for message-dependent deadlocks based on progressive deadlock recovery and compare its performance against other approaches. Results show that message-dependent deadlocks occur very infrequently under typical circumstances, thus rendering approaches based on avoiding them overly restrictive in the common case. The proposed technique relaxes restrictions considerably, allowing the routing of packets and the handling of message-dependent deadlocks to be much more efficient, particularly when network resources are scarce.

Index Terms:
Interconnection network, deadlock-free routing, message dependency, parallel processing.
Citation:
Yong Ho Song, Timothy Mark Pinkston, "A Progressive Approach to Handling Message-Dependent Deadlock in Parallel Computer Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 3, pp. 259-275, March 2003, doi:10.1109/TPDS.2003.1189584