On Diagnosability of Large Fault Sets in Regular Topology-Based Computer Systems
August 1996 (vol. 45 no. 8)
pp. 892-903

Abstract: The classical t-diagnosability approach has limitations when dealing with large fault sets in large multiprocessor systems, because systems connected using regular interconnection structures have a limited degree of diagnosability. We propose an alternative approach to system diagnosis that allows a bounded number of units to be diagnosed incorrectly. This measure is called t/k-diagnosability. Using this new measure, it is possible to increase the degree of diagnosability of large systems considerably. The t/k-diagnosis guarantees that all the faulty units (processors) in a system are detected, provided the number of faulty units does not exceed t, while at most k units are incorrectly diagnosed. We provide necessary and sufficient conditions for t/k-diagnosability and discuss their implications. To demonstrate the power of this approach, we analyze the diagnosability of large systems connected as hypercubes, star graphs, and meshes. It is shown that a substantial increase in the degree of diagnosability of these structures is achieved, compared with the degree achieved using the classical t-diagnosability approach, at the cost of a comparatively small number of incorrectly diagnosed units.
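
The guarantee stated in the abstract can be illustrated with a minimal sketch. It is not the paper's diagnosis algorithm, and it does not test whether a topology is t/k-diagnosable. It only checks whether a given diagnosis outcome satisfies the two stated properties. The function name and arguments are hypothetical, and the sketch assumes that "incorrectly diagnosed" means fault-free units labeled faulty, which follows if every faulty unit is detected.

```python
def is_acceptable_tk_diagnosis(actual_faulty, diagnosed_faulty, t, k):
    """Check a diagnosis outcome against the t/k guarantee (illustrative only).

    actual_faulty    -- units that are really faulty (hypothetical input)
    diagnosed_faulty -- units the diagnosis labels as faulty
    t                -- bound on the number of faulty units
    k                -- bound on the number of incorrectly diagnosed units
    """
    actual = set(actual_faulty)
    diagnosed = set(diagnosed_faulty)

    # The guarantee only applies when the fault set has at most t units.
    if len(actual) > t:
        raise ValueError("fault set exceeds t; the guarantee does not apply")

    # Property 1: every faulty unit is detected.
    all_faults_found = actual <= diagnosed

    # Property 2 (assumed reading): at most k fault-free units are
    # labeled faulty.
    misdiagnosed = diagnosed - actual

    return all_faults_found and len(misdiagnosed) <= k
```

For example, with t = 3 and k = 1, the outcome {1, 2, 3} for the fault set {1, 2} is acceptable (one fault-free unit is misdiagnosed), whereas {1, 3} is not, because faulty unit 2 is missed.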

Index Terms:
Fault diagnosis, system level diagnosis, t-diagnosable systems, t/k-diagnosable systems, degree of diagnosability, t/k-diagnosability of hypercubes, t/k-diagnosability of star-graph.
Citation:
Arun K. Somani, Ofer Peleg, "On Diagnosability of Large Fault Sets in Regular Topology-Based Computer Systems," IEEE Transactions on Computers, vol. 45, no. 8, pp. 892-903, Aug. 1996, doi:10.1109/12.536232