An Efficient and Deadlock-Free Network Reconfiguration Protocol
June 2008 (vol. 57 no. 6)
pp. 762-779
Component failures and planned component replacements cause changes in the topology and routing paths supplied by the interconnection network of a parallel processor system over time. Such changes may require the network to be reconfigured such that the existing routing function is replaced by one which enables packets to reach their intended destinations amid the changes. Efficient reconfiguration methods are desired that allow the network to function uninterruptedly over the course of the reconfiguration process while remaining free from deadlocking behavior. In this paper, we propose, evaluate, and prove deadlock freedom of a new network reconfiguration protocol that overlaps various phases of "static" reconfiguration processes traditionally used in commercial and research systems to provide performance efficiency on par with that of recently proposed "dynamic" reconfiguration processes, but without their complexity. Simulation results show that the proposed Overlapping Static Reconfiguration protocol can reduce reconfiguration time by up to 50%, reduce packet latency by several orders of magnitude, reduce packet dropping by an order of magnitude, and provide unhalted packet injection as compared to traditional static reconfiguration while allowing network throughput similar to dynamic reconfiguration.

Index Terms:
Interconnections (Subsystems), I/O and Data Communications, Topology
Olav Lysne, Jose Miguel Montanana, Jose Flich, Jose Duato, Timothy Mark Pinkston, Tor Skeie, "An Efficient and Deadlock-Free Network Reconfiguration Protocol," IEEE Transactions on Computers, vol. 57, no. 6, pp. 762-779, June 2008, doi:10.1109/TC.2008.31