A Graph Partitioning Approach to Sequential Diagnosis
January 1997 (vol. 46 no. 1)
pp. 39-47

Abstract: This paper describes a generalized sequential diagnosis algorithm whose analysis leads to strong diagnosability results for a variety of multiprocessor interconnection topologies. The overall complexity of this algorithm, in terms of total testing and syndrome decoding time, is linear in the number of edges in the interconnection graph, and the total number of iterations of diagnosis and repair needed by the algorithm is bounded by the diameter of the interconnection graph. The degree of diagnosability of this algorithm for a given interconnection graph is shown to be directly related to a graph parameter which we refer to as the partition number. We approximate this graph parameter for several interconnection topologies and thereby obtain lower bounds on the degree of diagnosability achieved by our algorithm on these topologies. Let N denote the total number of vertices in the interconnection graph and Δ the maximum degree of any vertex in it; our results may then be summarized as follows. We show that a symmetric d-dimensional grid graph is sequentially $\Omega\left(N^{\frac{d}{d+1}}\right)$-diagnosable for any fixed d. For hypercubes, which are symmetric log N-dimensional grid graphs, it is shown that our algorithm leads to a surprising $\Omega\left(\frac{N \log\log N}{\log N}\right)$ degree of diagnosability. Next we show that the degree of diagnosability of an arbitrary interconnection graph by our algorithm is $\Omega\left(\sqrt{\frac{N}{\Delta}}\right)$. This bound translates to an $\Omega\left(\sqrt{N}\right)$ degree of diagnosability for cube-connected cycles and an $\Omega\left(\sqrt{\frac{N}{k}}\right)$ degree of diagnosability for k-ary trees. Finally, we augment our algorithm with another algorithm to show that every topology is $\Omega\left(N^{\frac{1}{3}}\right)$-diagnosable.
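To give a feel for how the bounds summarized above compare, the following minimal Python sketch evaluates only the growth terms inside each Ω(·). It is an illustration, not the paper's algorithm: the function names are hypothetical, the constants hidden by the Ω notation are ignored, and base-2 logarithms are an assumption (the base affects only constant factors). The returned values are therefore not actual degrees of diagnosability.

```python
import math


def grid_term(n: int, d: int) -> float:
    """Growth term N^(d/(d+1)) for a symmetric d-dimensional grid, fixed d."""
    return n ** (d / (d + 1))


def hypercube_term(n: int) -> float:
    """Growth term N * log log N / log N for a hypercube on N vertices.

    Base-2 logarithms are an assumption made for this sketch.
    """
    log_n = math.log2(n)
    return n * math.log2(log_n) / log_n


def general_term(n: int, max_degree: int) -> float:
    """Growth term sqrt(N / Delta) for an arbitrary interconnection graph."""
    return math.sqrt(n / max_degree)


def universal_term(n: int) -> float:
    """Growth term N^(1/3), which the abstract states holds for every topology."""
    return n ** (1 / 3)


if __name__ == "__main__":
    n = 65536  # hypothetical system size, chosen as a power of two
    print("2-D grid       :", grid_term(n, 2))
    print("hypercube      :", hypercube_term(n))
    print("degree-3 graph :", general_term(n, 3))  # e.g. cube-connected cycles
    print("any topology   :", universal_term(n))
```

Because cube-connected cycles have maximum degree 3, the general term reduces to a constant multiple of the square root of N, which matches the bound quoted for that topology.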

Index Terms:
Analysis of algorithms, degree of diagnosability, fault-tolerance, graph partitioning, multiprocessor systems, sequential diagnosis, system-level diagnosis.
Citation:
Sanjeev Khanna, W. Kent Fuchs, "A Graph Partitioning Approach to Sequential Diagnosis," IEEE Transactions on Computers, vol. 46, no. 1, pp. 39-47, Jan. 1997, doi:10.1109/12.559801