Compressionless Routing: A Framework for Adaptive and Fault-Tolerant Routing
March 1997 (vol. 8 no. 3)
pp. 229-244

Abstract:
Compressionless Routing (CR) is a new adaptive routing framework that unifies efficient deadlock-free adaptive routing and fault tolerance. CR exploits the tight coupling between wormhole routers for flow control to detect and recover from potential deadlock situations. Fault-tolerant Compressionless Routing (FCR) extends CR to support end-to-end fault-tolerant delivery. Detailed routing algorithms, implementation complexity, and performance simulation results for CR and FCR are presented. These results show that the hardware required for CR and FCR networks is modest. Further, CR and FCR networks can outperform alternatives such as dimension-order routing.

Compressionless Routing has several key advantages: deadlock-free adaptive routing in toroidal networks with no virtual channels, simple router designs, order-preserving message transmission, applicability to a wide variety of network topologies, and elimination of the need for buffer allocation messages. Fault-tolerant Compressionless Routing has several additional advantages: data integrity in the presence of transient faults (nonstop fault tolerance), tolerance of permanent faults, and elimination of the need for software buffering and retry for reliability. The advantages of CR and FCR not only simplify hardware support for adaptive routing and fault tolerance but can also simplify software communication layers.

Index Terms:
Routing networks, adaptive routing, deadlock prevention, fault tolerance, wormhole routing.
Citation:
Jae H. Kim, Ziqiang Liu, Andrew A. Chien, "Compressionless Routing: A Framework for Adaptive and Fault-Tolerant Routing," IEEE Transactions on Parallel and Distributed Systems, vol. 8, no. 3, pp. 229-244, March 1997, doi:10.1109/71.584089