1st IEEE Symposium on High-Performance Computer Architecture (HPCA) (1995)
Raleigh, North Carolina
Jan. 22, 1995 to Jan. 25, 1995
E. Brandt, Dept. of Comput. Sci., Harvey Mudd Coll., Claremont, CA, USA
R. Libeskind-Hadas, Dept. of Comput. Sci., Harvey Mudd Coll., Claremont, CA, USA
The ability to tolerate faults is critical in multicomputers that employ large numbers of processors. This paper describes a class of fault-tolerant routing algorithms for n-dimensional meshes that can tolerate large numbers of faults without using virtual channels. We show that these routing algorithms prevent livelock and deadlock while remaining highly adaptive.
distributed memory systems; multiprocessor interconnection networks; fault tolerant computing; reliability; concurrency control; message passing; origin-based fault-tolerant routing; fault-tolerant routing algorithms; n-dimensional meshes; virtual channels; livelock; deadlock
E. Brandt, R. Libeskind-Hadas, "Origin-based fault-tolerant routing in the mesh", 1st IEEE Symposium on High-Performance Computer Architecture (HPCA), p. 102, 1995, doi:10.1109/HPCA.1995.386551