FC3D: Flow Control-Based Distributed Deadlock Detection Mechanism for True Fully Adaptive Routing in Wormhole Networks
August 2003 (vol. 14 no. 8)
pp. 765-779
Pedro López, IEEE Computer Society
José Duato, IEEE

Abstract—Two general approaches have been proposed for deadlock handling in wormhole networks. Traditionally, deadlock-avoidance strategies have been used. In this case, either routing is restricted so that there are no cyclic dependencies between channels or cyclic dependencies between channels are allowed provided that there are some escape paths to avoid deadlock. More recently, deadlock recovery strategies have begun to gain acceptance. These strategies allow the use of unrestricted fully adaptive routing, usually outperforming deadlock avoidance techniques. However, they require a deadlock detection mechanism and a deadlock recovery mechanism that is able to recover from deadlocks faster than they occur. In particular, progressive deadlock recovery techniques are very attractive because they allocate a few dedicated resources to quickly deliver deadlocked messages, instead of killing them. Unfortunately, distributed deadlock detection is usually based on crude time-outs, which detect many false deadlocks. As a consequence, messages detected as deadlocked may saturate the bandwidth offered by recovery resources, thus degrading performance. Additionally, the threshold required by the detection mechanism (the time-out) strongly depends on network load, which is not known in advance at the design stage. This limits the applicability of deadlock recovery on actual networks. In this paper, we propose a novel distributed deadlock detection mechanism that uses only local information, detects all the deadlocks, considerably reduces the probability of false deadlock detection over previously proposed techniques, and is not significantly affected by variations in message length and/or message destination distribution.

Index Terms:
Wormhole switching, deadlock detection, deadlock recovery, true fully adaptive routing.
Citation:
Juan-Miguel Martínez Rubio, Pedro López, José Duato, "FC3D: Flow Control-Based Distributed Deadlock Detection Mechanism for True Fully Adaptive Routing in Wormhole Networks," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 8, pp. 765-779, Aug. 2003, doi:10.1109/TPDS.2003.1225056