Issue No. 09 - September (1992 vol. 41)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/12.165394
The problem of fault diagnosis in multiprocessor systems is considered under a probabilistic fault model. The focus is on minimizing the number of tests that must be conducted to correctly diagnose the state of every processor in the system with high probability. A diagnosis algorithm is presented that, for a class of systems performing slightly more than a linear number of tests, correctly diagnoses these states with probability approaching one. A nearly matching lower bound on the number of tests required to achieve correct diagnosis in arbitrary systems is proved. Lower and upper bounds on the number of tests required for regular systems are presented. A class of regular systems that includes hypercubes is shown to be correctly diagnosable with high probability. In all cases, the number of tests required under this probabilistic model is shown to be significantly smaller than under a bounded-size fault set model. These results represent a substantial improvement in the performance of system-level diagnosis techniques.
multiprocessor systems; fault diagnosis; probabilistic fault model; diagnosis algorithm; lower bound; upper bounds; hypercubes; computational complexity; fault tolerant computing; hypercube networks; multiprocessing systems; parallel algorithms; probability.
D. Blough, G. Sullivan and G. Masson, "Efficient Diagnosis of Multiprocessor Systems Under Probabilistic Models," in IEEE Transactions on Computers, vol. 41, no. 9, pp. 1126-1136, 1992.