Issue No. 06 - June (1996 vol. 45)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/12.506422
<p><b>Abstract</b>: This paper focuses on designing high-performance pipelined networks that can operate in the presence of dynamic component failures. A general, rigorous framework for deadlock-free communication in faulty, pipelined networks is developed. A mechanism is also proposed for recovering from dynamic link and node failures. The recovery mechanism 1) is fully distributed, 2) does not require timeouts, 3) prevents fault-induced deadlock, and 4) is integrated into the virtual channel flow control mechanisms. This recovery mechanism is used to develop a new pipelined communication mechanism, acknowledged pipelined circuit-switching (APCS). This mechanism supports existing routing protocols [<ref rid="bibt065119" type="bib">19</ref>] that can tolerate a maximal number of static link failures, i.e., one less than the number of ports on a node. An implementation of a novel router architecture is described, and the results of detailed flit-level simulations are presented. Finally, the proposed recovery mechanism is shown to be applicable to existing adaptive wormhole routing protocols, which are prone to deadlock in the presence of dynamic faults.</p>
Index Terms: Dynamic fault tolerance, reliable message delivery, distributed recovery mechanism, pipelined interconnection network, wormhole routing.
Binh V. Dao, Sudhakar Yalamanchili, David E. Schimmel, Patrick T. Gaughan, "Distributed, Deadlock-Free Routing in Faulty, Pipelined, Direct Interconnection Networks", IEEE Transactions on Computers, vol. 45, no. 6, pp. 651-665, June 1996, doi:10.1109/12.506422