<p><it>Abstract</it>—This paper proposes a fault-tolerant distributed subcube management scheme for hypercube multicomputer systems. Gracefully degradable subcube management is supported by a data structure, called the <it>distributed subcube table</it> (DST), and a fault-tolerant broadcast protocol, called the <it>reliably synchronized broadcast</it> (RSB). In an <it>n</it>-dimensional hypercube, the DST is the collection of 2<super><it>n</it></super> <it>local subcube tables</it> (LSTs), <math><tmath>${\mbi DST = \{LST_0,\,LST_1,\,\dots,\,LST_{2^n-1}\}}$</tmath></math>, where <it>LST</it><sub><it>x</it></sub> is a bit-mapped table assigned to <it>N</it><sub><it>x</it></sub>, a fault-free node whose address is <it>x</it>. For every <it>x</it>, <it>LST</it><sub><it>x</it></sub> is <it>n</it> + 1 bits long and records the status (free/busy) of certain subcubes adjacent to <it>N</it><sub><it>x</it></sub>. The RSB diagnoses and avoids faults during interprocessor communication, which prevents faulty nodes from being allocated for job execution. In addition to its fault-tolerant design, the scheme achieves performance comparable to or better than that of existing centralized schemes, as verified by extensive simulation.</p>
Distributed subcube management, fault-tolerance, hypercube multicomputer, reliable broadcast.
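The storage layout described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which subcube each of the <it>n</it> + 1 bits tracks, so the bit index `k` below is an opaque slot, and the names `DistributedSubcubeTable`, `mark`, `status`, `total_bits`, and `neighbors` are hypothetical. The only facts taken from the abstract are that each fault-free node holds one table and that each table is <it>n</it> + 1 bits long.

```python
from typing import Dict, Iterable, List

FREE, BUSY = 0, 1  # status values; this 0/1 encoding is an assumption


def neighbors(x: int, n: int) -> List[int]:
    """Addresses adjacent to node x in an n-cube (differ in exactly one bit)."""
    return [x ^ (1 << d) for d in range(n)]


class DistributedSubcubeTable:
    """Hypothetical model: one (n + 1)-bit table per fault-free node."""

    def __init__(self, n: int, faulty: Iterable[int] = ()) -> None:
        self.n = n
        faulty_set = set(faulty)
        # Faulty nodes hold no table; every slot starts out FREE.
        self.lst: Dict[int, int] = {
            x: 0 for x in range(1 << n) if x not in faulty_set
        }

    def mark(self, x: int, k: int, status: int) -> None:
        """Set slot k (0 <= k <= n) of LST_x to FREE or BUSY."""
        if not 0 <= k <= self.n:
            raise IndexError("an LST has only n + 1 bits")
        if status == BUSY:
            self.lst[x] |= 1 << k
        else:
            self.lst[x] &= ~(1 << k)

    def status(self, x: int, k: int) -> int:
        """Read slot k of LST_x."""
        return (self.lst[x] >> k) & 1

    def total_bits(self) -> int:
        """Total storage across all fault-free nodes."""
        return len(self.lst) * (self.n + 1)
```

In this model, a 3-cube with one faulty node keeps 7 tables of 4 bits each. With no faults, total storage is 2<super><it>n</it></super>(<it>n</it> + 1) bits, spread evenly across the nodes rather than held by a single host.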
Jyh-Charn Liu, Yi-long Chen, "A Fault-Tolerant Distributed Subcube Management Scheme for Hypercube Multicomputer Systems", IEEE Transactions on Parallel & Distributed Systems, vol. 6, no. , pp. 766-772, July 1995, doi:10.1109/71.395406