Issue No. 06 - June (1994 vol. 5)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.285608
<p>This paper addresses the distributed self-diagnosis of a multiprocessor/multicomputer system based on fault syndromes formed by comparison testing. The authors show that by using multiple fault syndromes, it is possible to achieve significantly better diagnosis than by using a single fault syndrome, even when the amount of time devoted to testing is the same. They derive a multiple syndrome diagnosis algorithm that, in terms of the level of diagnostic accuracy achieved, is globally suboptimal, but optimal among all diagnosis algorithms of a certain type to be defined. The diagnosis algorithm produces good results, even with sparse interconnection networks and interprocessor tests with low fault coverage. It is also proven that the diagnosis algorithm produces 100% correct diagnosis as N, the number of nodes in the system, approaches infinity, provided that the interconnection network has connectivity greater than or equal to 2 and that the number of syndromes produced grows faster than log N. This solution and another multiple syndrome diagnosis solution by Fussell and Rangarajan (1989) are comparatively evaluated, both analytically and with simulations.</p>
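The intuition behind using multiple syndromes can be illustrated with a small sketch. It is not the algorithm derived in the paper; it is a toy model built on assumptions made only for this example: a ring network (connectivity 2), a comparison test in which two fault-free neighbors always agree and a pair containing a faulty node disagrees with probability `coverage`, and a deliberately simple decision rule. The names `collect_syndromes` and `diagnose` are hypothetical.

```python
import random


def collect_syndromes(n, faulty, coverage, rounds, rng):
    """Count mismatches per ring edge over `rounds` comparison-test syndromes.

    Edge i joins node i and node (i + 1) % n. Assumed test model: a pair
    containing at least one faulty node disagrees with probability
    `coverage`; two fault-free nodes never disagree.
    """
    mismatches = [0] * n
    for _ in range(rounds):
        for i in range(n):
            j = (i + 1) % n
            if (i in faulty or j in faulty) and rng.random() < coverage:
                mismatches[i] += 1
    return mismatches


def diagnose(n, mismatches):
    """Toy rule (an assumption, not the paper's): flag node i as faulty
    when both of its incident edges reported at least one mismatch."""
    return {
        i
        for i in range(n)
        if mismatches[i] > 0 and mismatches[(i - 1) % n] > 0
    }


if __name__ == "__main__":
    rng = random.Random(0)
    for rounds in (1, 20):
        counts = collect_syndromes(8, {3}, 0.3, rounds, rng)
        print(rounds, "syndrome(s):", sorted(diagnose(8, counts)))
```

Under this toy model, a faulty node with fault-free neighbors is flagged after one syndrome only if both of its tests catch the fault, which happens with probability 0.3 × 0.3 = 0.09 at 30% coverage. With 20 independent syndromes the probability rises to (1 − 0.7^20)^2, about 0.998. The rule also has an obvious weakness: a fault-free node between two faulty neighbors is misdiagnosed.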
Index Terms: multiprocessing systems; probability; fault tolerant computing; performance evaluation; probabilistic diagnosis; multiprocessor systems; multiple syndromes; distributed self-diagnosis; comparison testing; diagnostic accuracy; diagnosis algorithms; sparse interconnection networks; interprocessor tests; low fault coverage; fault-tolerant computing; intermittent fault; multicomputer; multiprocessor; self-test; system-level diagnosis
S. Lee and K. Shin, "On Probabilistic Diagnosis of Multiprocessor Systems Using Multiple Syndromes," in IEEE Transactions on Parallel & Distributed Systems, vol. 5, no. 6, pp. 630-638, 1994.