
S. Rangarajan, D. Fussell, "Diagnosing Arbitrarily Connected Parallel Computers with High Probability," IEEE Transactions on Computers, vol. 41, no. 5, pp. 606-615, May 1992.  
DOI: http://doi.ieeecomputersociety.org/10.1109/12.142687  
ISSN: 0018-9340  
Publisher: IEEE Computer Society, Los Alamitos, CA, USA  
Keywords: parallel computers; probabilistic fault diagnosis; multiple tests; probabilistic diagnosis algorithms; asymptotic behavior; multiprocessor system; immediate neighbors; homogeneous parallel architecture; testing processors; fault tolerant computing; parallel algorithms; parallel architectures; parallel machines; probability.  
A practical model for probabilistic fault diagnosis is presented. Unlike PMC-based models, this model allows a tester to conduct multiple tests on the same processor. This permits the design of efficient probabilistic diagnosis algorithms with good asymptotic behavior while placing minimal constraints on the connection structure of the multiprocessor system, in contrast to other deterministic and probabilistic approaches. In practical cases, the number of immediate neighbors of any processor need be no greater than two, which implies that the algorithm can be applied to any practical homogeneous parallel architecture. It is also shown how to make efficient use of tests by trading off the number of testing processors against the number of tests each processor performs, while still achieving asymptotically accurate diagnosis.
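
To make the role of repeated tests concrete, the following Python sketch simulates a ring in which each processor is tested only by its two immediate neighbors. It is an illustration under stated assumptions, not the algorithm from the paper: the function name `simulate_ring_diagnosis`, the fault model (independent faults, coin-flip reports from faulty testers, a per-test detection probability `detect_prob`), and the majority-vote decision rule are all hypothetical choices made for this example.

```python
import random


def simulate_ring_diagnosis(n, fault_prob, tests_per_tester, detect_prob, rng):
    """Return the fraction of processors diagnosed correctly in one trial.

    Illustrative model only; every modeling choice below is an assumption.
    """
    # Assumption: each processor is faulty independently with probability fault_prob.
    faulty = [rng.random() < fault_prob for _ in range(n)]
    correct = 0
    for target in range(n):
        fails = 0
        total = 0
        # Ring topology: the two immediate neighbors act as testers.
        for tester in ((target - 1) % n, (target + 1) % n):
            for _ in range(tests_per_tester):
                if faulty[tester]:
                    # Assumption: a faulty tester reports a coin-flip result.
                    outcome_fail = rng.random() < 0.5
                elif faulty[target]:
                    # Assumption: a fault-free tester catches a faulty target
                    # with probability detect_prob on each individual test.
                    outcome_fail = rng.random() < detect_prob
                else:
                    # Assumption: a fault-free tester never fails a fault-free target.
                    outcome_fail = False
                fails += outcome_fail
                total += 1
        # Hypothetical decision rule: declare the target faulty when a strict
        # majority of all test outcomes collected for it are failures.
        diagnosed_faulty = 2 * fails > total
        correct += diagnosed_faulty == faulty[target]
    return correct / n


if __name__ == "__main__":
    rng = random.Random(0)
    for k in (1, 5, 15):
        accuracy = simulate_ring_diagnosis(20000, 0.1, k, 0.9, rng)
        print(f"tests per tester = {k:2d}, fraction diagnosed correctly = {accuracy:.4f}")
```

Under these assumptions, raising `tests_per_tester` while keeping only two testers per processor tends to raise the fraction of correct diagnoses, which is the kind of trade-off between the number of testers and the number of tests per tester that the abstract describes.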