Issue No.01 - January-February (2011 vol.8)
pp: 74-88
Dong Xiang, Tsinghua University, Beijing
ABSTRACT
A new deadlock-free routing scheme for meshes is proposed, based on a new virtual network partitioning scheme called channel overlapping. Under this scheme, two virtual networks can share some common virtual channels. The deadlock-free adaptive routing method is then extended to deadlock-free adaptive fault-tolerant routing in 3D meshes, still with two virtual channels. A few faulty nodes can make a higher-dimensional mesh unsafe for fault-tolerant routing methods based on the block fault model, where the whole system (n-dimensional space) forms a fault block. Planar safety information in meshes is proposed to guide fault-tolerant routing; it classifies fault-free nodes inside 2D planes. Many nodes globally marked as unsafe in the whole system become locally enabled inside 2D planes. The fault-tolerant deadlock-free adaptive routing algorithm is also extended to n-dimensional meshes with two virtual channels. Extensive simulation results are presented and compared with previous methods.
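The difference between global and planar safety labeling can be illustrated with a small sketch. It is not the labeling procedure of the paper, which the abstract does not spell out. It assumes one commonly used block-fault rule: a fault-free node becomes unsafe if it has faulty or unsafe neighbors along at least two different dimensions. The function name `unsafe_nodes`, the 3x3x3 mesh, and the fault positions are hypothetical choices for illustration only.

```python
from itertools import product

def unsafe_nodes(shape, faulty, dims):
    """Return the fault-free nodes labeled unsafe (assumed rule, see lead-in).

    Only neighbors along the dimensions in `dims` are inspected. Passing all
    dimensions gives a global labeling; passing two dimensions labels every
    2D plane spanned by them independently.
    """
    blocked = set(faulty)          # faulty nodes plus nodes labeled unsafe so far
    changed = True
    while changed:                 # repeat until no new node is labeled
        changed = False
        for node in product(*(range(k) for k in shape)):
            if node in blocked:
                continue
            bad_dims = 0
            for d in dims:
                for step in (-1, 1):
                    nb = list(node)
                    nb[d] += step
                    if tuple(nb) in blocked:
                        bad_dims += 1   # count each dimension at most once
                        break
            if bad_dims >= 2:
                blocked.add(node)
                changed = True
    return blocked - set(faulty)

# Hypothetical example: two faults in a 3x3x3 mesh.
shape = (3, 3, 3)
faulty = {(1, 1, 0), (1, 0, 1)}

global_unsafe = unsafe_nodes(shape, faulty, dims=(0, 1, 2))  # whole 3D system
xy_unsafe = unsafe_nodes(shape, faulty, dims=(0, 1))         # inside each XY plane

print("globally unsafe:", sorted(global_unsafe))
print("unsafe inside XY planes:", sorted(xy_unsafe))
```

Under this assumed rule, node (1, 1, 1) is unsafe in the global labeling because it sees one fault along the second dimension and one along the third, yet inside its XY plane it sees a fault along only one dimension and stays enabled there.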