Correct and Almost Complete Diagnosis of Processor Grids
October 2001 (vol. 50 no. 10)
pp. 1095-1102

Abstract—A new diagnosis algorithm for square grids is introduced. The algorithm always provides correct diagnosis if the number of faulty processors is below $T$, a bound with $T\in\Theta(n^{2/3})$, which was derived by worst-case analysis. A more effective tool to validate the diagnosis correctness is the syndrome dependent bound $T_\sigma$, with $T_\sigma\geq T$, asserted by the diagnosis algorithm itself for every given diagnosis experiment. Simulation studies provided evidence that the diagnosis is complete or almost complete if the number of faults is below $T$. The fraction of units which cannot be identified as either faulty or nonfaulty remains relatively small as long as the number of faults is below $n/3$ and, as long as the number of faults is below $n/2$, the diagnosis is correct with high probability.

Index Terms:
System-level diagnosis, PMC model, processor grids, constant-degree diagnosis, diagnosis algorithm.
Citation:
Stefano Chessa, Piero Maestrini, "Correct and Almost Complete Diagnosis of Processor Grids," IEEE Transactions on Computers, vol. 50, no. 10, pp. 1095-1102, Oct. 2001, doi:10.1109/12.956094