Dynamic Testing Strategy for Distributed Systems
March 1989 (vol. 38 no. 3)
pp. 356-365
Fault diagnosis is treated as two distinct processes: fault discovery and dissemination of diagnostic information. Previous research determined what level of self-diagnosability a given set of tests in a homogeneous system achieves, using a model in which only node failures occur and test coverage is complete. Adopting the same model, a new methodology is presented that minimizes the overhead as

Index Terms:
dynamic testing strategy; diagnostic information dissemination; distributed systems; fault discovery; node failures; periodic testing; distributed processing; fault tolerant computing.
F.J. Meyer, D.K. Pradhan, "Dynamic Testing Strategy for Distributed Systems," IEEE Transactions on Computers, vol. 38, no. 3, pp. 356-365, March 1989, doi:10.1109/12.21122