Fault-Tolerant Routing in Hypercube Multicomputers Using Local Safety Information
September 2001 (vol. 12 no. 9)
pp. 942-951

Abstract: This paper studies fault-tolerant routing for injured hypercubes using local safety information. It is shown that a minimum feasible path is always available if the spanning subcube that contains both the source and the destination is safe. Safety information outside the spanning subcube is used only when derouting is needed. A routing scheme based on local safety information is proposed; its extra cost for obtaining local safety information is comparable to that of the scheme based on global safety information. The proposed algorithm is guaranteed to find a minimum feasible path if the spanning subcube is contained in a maximal safe subcube and the source is locally safe in that maximal safe subcube. When these conditions are not met, a new technique based on local safety information is proposed to set up a partial path. Extensive simulation results demonstrate the effectiveness of the method in comparison with previous methods.
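
To make the notion of a spanning subcube concrete, the following is a minimal sketch, assuming the usual definition: the smallest subcube containing both source and destination, obtained by freeing every dimension in which the two addresses differ. The function names (`spanning_subcube`, `hamming_distance`, `in_subcube`) and the string encoding with `*` for a free dimension are illustrative assumptions, not taken from the paper, and the paper's safety tests are not modeled here.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of dimensions in which two node addresses differ.

    In a fault-free hypercube this equals the length of a minimum path.
    """
    return bin(a ^ b).count("1")


def spanning_subcube(source: int, dest: int, n: int) -> str:
    """Smallest subcube of an n-cube containing both nodes (assumed definition).

    Returned as a string of length n, most significant bit first:
    '0'/'1' for dimensions where the nodes agree, '*' where they differ.
    """
    if not (0 <= source < 2 ** n and 0 <= dest < 2 ** n):
        raise ValueError("node address out of range for an n-cube")
    diff = source ^ dest
    symbols = []
    for bit in range(n - 1, -1, -1):
        if (diff >> bit) & 1:
            symbols.append("*")          # free dimension
        else:
            symbols.append(str((source >> bit) & 1))  # fixed dimension
    return "".join(symbols)


def in_subcube(node: int, subcube: str) -> bool:
    """True if the node lies inside the subcube given in '*' notation."""
    bits = format(node, f"0{len(subcube)}b")
    return all(s == "*" or s == b for s, b in zip(subcube, bits))


if __name__ == "__main__":
    src, dst = 0b0010, 0b0111
    cube = spanning_subcube(src, dst, 4)
    print(cube, hamming_distance(src, dst), in_subcube(0b0110, cube))
```

The dimension of the spanning subcube equals the Hamming distance between source and destination, so every minimum path between the two nodes stays inside it. This is why safety information outside the spanning subcube matters only when derouting is required.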

Index Terms:
Fault-tolerant routing, hypercube multicomputer, local safety, maximal safe subcube, safe node, spanning subcube, unsafe node.
Citation:
Dong Xiang, "Fault-Tolerant Routing in Hypercube Multicomputers Using Local Safety Information," IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 9, pp. 942-951, Sept. 2001, doi:10.1109/TPDS.2001.10002