A Family of Fault-Tolerant Routing Protocols for Direct Multiprocessor Networks
May 1995 (vol. 6 no. 5)
pp. 482-497

Abstract: Our goal is to reconcile the conflicting demands of performance and fault tolerance in interprocessor communication. To this end, we propose a pipelined communication mechanism, pipelined circuit-switching (PCS), which is a variant of the well-known wormhole routing (WR) mechanism. PCS relaxes some of the routing constraints imposed by WR and, as a result, enables routing behavior that cannot otherwise be realized. This paper presents a new class of adaptive routing algorithms made possible by PCS: misrouting backtracking with $m$ misroutes (MB-$m$). We provide an analysis of the performance and static fault-tolerant properties of MB-$m$, and present the results of an experimental evaluation of PCS and MB-3. This methodology provides performance approaching that of WR, while realizing a level of resilience to static faults that is difficult to achieve with WR.
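
The abstract names MB-$m$ but does not spell out its mechanics. The sketch below is only an illustration of the general idea the name suggests: a depth-first path search that may backtrack out of dead ends and may take at most $m$ hops that move away from the destination. It is not the paper's protocol. The 2-D mesh topology, the function name `mb_m_route`, the hop-preference order, and the rule that backtracking over a misroute restores the budget are all assumptions made for the example.

```python
from typing import List, Optional, Set, Tuple

Node = Tuple[int, int]  # (x, y) coordinate in a hypothetical 2-D mesh


def manhattan(a: Node, b: Node) -> int:
    """Hop distance between two mesh nodes when no faults are present."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mb_m_route(src: Node, dst: Node, width: int, height: int,
               faulty: Set[Node], m: int) -> Optional[List[Node]]:
    """Search for a fault-free path from src to dst.

    Assumed behaviour (illustrative, not taken from the paper):
    - a hop that increases the distance to dst counts as one misroute;
    - the current path may contain at most m misroutes;
    - on a dead end the search backtracks one hop and tries another link.
    The search is exhaustive over simple paths, so it suits small meshes only.
    """
    if src in faulty or dst in faulty:
        return None

    path: List[Node] = [src]
    on_path: Set[Node] = {src}  # nodes reserved by the current partial path

    def probe(node: Node, misroutes_left: int) -> bool:
        if node == dst:
            return True
        x, y = node
        neighbors = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        candidates = [
            n for n in neighbors
            if 0 <= n[0] < width and 0 <= n[1] < height
            and n not in faulty and n not in on_path
        ]
        # Prefer hops that reduce the distance before considering misroutes.
        candidates.sort(key=lambda n: manhattan(n, dst))
        for nxt in candidates:
            is_misroute = manhattan(nxt, dst) > manhattan(node, dst)
            if is_misroute and misroutes_left == 0:
                continue  # misroute budget exhausted on this path
            path.append(nxt)
            on_path.add(nxt)
            if probe(nxt, misroutes_left - (1 if is_misroute else 0)):
                return True
            # Dead end: backtrack and release the node.
            path.pop()
            on_path.remove(nxt)
        return False

    return path if probe(src, m) else None


if __name__ == "__main__":
    blocked = {(1, 0)}  # one statically faulty node on the direct route
    print(mb_m_route((0, 0), (3, 0), 4, 4, blocked, m=0))
    print(mb_m_route((0, 0), (3, 0), 4, 4, blocked, m=1))
```

In this toy setting, a faulty node on the only minimal route makes the search fail with `m=0`, while a single permitted misroute is enough to step around the fault. That is the kind of trade-off between path length and resilience to static faults that the abstract describes.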

Citation:
Patrick T. Gaughan, Sudhakar Yalamanchili, "A Family of Fault-Tolerant Routing Protocols for Direct Multiprocessor Networks," IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 5, pp. 482-497, May 1995, doi:10.1109/71.382317