Dynamic Fault Tolerance in Fat Trees
April 2011 (vol. 60 no. 4)
pp. 508-525
Frank Olaf Sem-Jacobsen, Simula Research Laboratory, Lysaker
Tor Skeie, Simula Research Laboratory, Lysaker
Olav Lysne, Simula Research Laboratory, Lysaker
José Duato, Simula Research Laboratory, Lysaker
Fat trees are a very common communication architecture in current large-scale parallel computers. The probability of failure in these systems increases with the number of components. We present a routing method for deterministically and adaptively routed fat trees, applicable to both distributed and source routing, that handles several concurrent faults and transparently returns to the original routing strategy once the faulty components have recovered. The method is local and dynamic, completely masking the fault from the rest of the system. It requires only a small amount of extra functionality in the switches to reroute packets around a fault. The method guarantees connectedness, deadlock freedom, and livelock freedom for up to k - 1 simultaneous benign switch and/or link faults, where k is half the number of ports in the switches. Our simulation experiments show a graceful degradation of performance as more faults occur. Furthermore, we demonstrate that for most fault combinations, our method can, with high probability, handle significantly more faults than the k - 1 limit.
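
As a worked illustration of the k - 1 bound stated in the abstract, the short sketch below computes the guaranteed fault limit from a switch's port count. The function name and the example port counts are hypothetical and chosen only for illustration; the sketch encodes the arithmetic of the bound, not the routing method itself.

```python
def guaranteed_fault_limit(ports_per_switch: int) -> int:
    """Return k - 1, where k is half the number of switch ports.

    Hypothetical helper: it only restates the bound from the abstract
    and assumes every switch has the same, even number of ports.
    """
    if ports_per_switch < 2 or ports_per_switch % 2 != 0:
        raise ValueError("expected an even port count of at least 2")
    k = ports_per_switch // 2
    return k - 1


if __name__ == "__main__":
    # Example port counts are illustrative, not taken from the paper.
    for ports in (4, 8, 36):
        print(ports, "ports ->", guaranteed_fault_limit(ports), "guaranteed faults")
```

For example, under this reading a fat tree built from 8-port switches (k = 4) is covered by the guarantee for up to 3 simultaneous benign faults.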

Index Terms:
Fat trees, k-ary n-trees, dynamic fault tolerance, deterministic routing, adaptive routing.
Frank Olaf Sem-Jacobsen, Tor Skeie, Olav Lysne, José Duato, "Dynamic Fault Tolerance in Fat Trees," IEEE Transactions on Computers, vol. 60, no. 4, pp. 508-525, April 2011, doi:10.1109/TC.2010.97