Efficient Algorithms for System Diagnosis with Both Processor and Comparator Faults
April 1993 (vol. 4 no. 4)
pp. 371-381

For the comparison-based self-diagnosis of multiprocessor systems, an extended model that considers both processor and comparator faults is presented. It is shown that in this model the system diagnosability is t ≥ ⌊δ/2⌋, where δ is the minimum vertex degree of the system graph. However, if the number of faulty comparators is assumed not to exceed the number of faulty processors, the diagnosability of the model reaches t ≥ δ. An optimal O(|E|) algorithm, where E is the set of comparators, is given for identifying all faulty processors and comparators, provided that the total number of faulty components does not exceed the system diagnosability, and an O(|E|²) algorithm for the case t ≥ δ is also presented. These efficient algorithms determine the faulty processors by calculating each processor's weight, which is mainly defined by the number of adjacent relative tests stating "agreement". After sorting the processors according to their weights, the algorithms determine all faulty components by separating the sorted processor list.
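
The weight-then-sort idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the exact weight definition or the separation rule, so everything below is an assumption made for illustration. The function names (agreement_weights, split_sorted, suspect_comparators) are hypothetical, the weight is simply the count of adjacent comparators reporting agreement, the sorted list is cut at the largest drop in weight, and a comparator is flagged only when it reports disagreement between two processors that the split placed on the fault-free side.

```python
from collections import defaultdict
from itertools import combinations


def agreement_weights(outcomes):
    """Count, per processor, the adjacent comparators reporting agreement.

    `outcomes` maps a comparator, written as a pair (u, v) of processor ids,
    to True (agreement) or False (disagreement).
    """
    weights = defaultdict(int)
    for (u, v), agree in outcomes.items():
        weights[u] += int(agree)
        weights[v] += int(agree)
    return dict(weights)


def split_sorted(weights):
    """Sort processors by descending weight and cut at the largest drop.

    ASSUMPTION: the largest-drop cut is an illustrative stand-in; the
    abstract does not state the paper's actual separation rule.
    Returns (presumed_fault_free, presumed_faulty).
    """
    ranked = sorted(weights, key=lambda p: (-weights[p], p))
    drops = [weights[a] - weights[b] for a, b in zip(ranked, ranked[1:])]
    if not drops or max(drops) == 0:
        return ranked, []  # no visible gap, so nothing is separated
    cut = drops.index(max(drops)) + 1
    return ranked[:cut], ranked[cut:]


def suspect_comparators(outcomes, fault_free):
    """Flag comparators that report disagreement between two processors
    on the presumed fault-free side (ASSUMPTION: fault-free processors
    always produce matching results)."""
    good = set(fault_free)
    return [e for e, agree in outcomes.items()
            if not agree and e[0] in good and e[1] in good]


# Hypothetical system: 5 fully connected processors. Processor 4 is faulty,
# and the comparator between processors 0 and 1 is faulty as well.
outcomes = {(u, v): True for u, v in combinations(range(5), 2)}
for u in range(4):
    outcomes[(u, 4)] = False   # comparisons with the faulty processor disagree
outcomes[(0, 1)] = False       # faulty comparator reports a false disagreement

weights = agreement_weights(outcomes)
fault_free, faulty = split_sorted(weights)
bad_comparators = suspect_comparators(outcomes, fault_free)
```

In this hypothetical run processor 4 ends up with weight 0 while every other processor has weight 2 or 3, so the cut isolates it, and the comparator (0, 1) is the only one flagged. The weight pass touches each comparator once, which is consistent in spirit with the linear bound quoted in the abstract; the comparison-based sort used here for brevity is not linear.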

Index Terms:
processor faults; system diagnosis; comparison-based self-diagnosis; multiprocessor systems; comparator faults; O(|E|²) algorithm; computational complexity; fault tolerant computing; multiprocessing systems
Y. Chen, W. Bücken, K. Echtle, "Efficient Algorithms for System Diagnosis with Both Processor and Comparator Faults," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 4, pp. 371-381, April 1993, doi:10.1109/71.219748