Reliability Analysis in Distributed Systems
March 1988 (vol. 37 no. 3)
pp. 352-358
Reliability of a distributed processing system is an important design parameter that can be described in terms of the reliability of processing elements and communication links and also of the redundancy of programs and data files. The traditional terminal-pair reliability does not capture the redundancy of programs and files in a distributed system. Two reliability measures are introduced: dis


Index Terms:
distributed systems; design parameter; communication links; redundancy; data files; distributed program reliability; graph traversal; distributed processing; fault tolerant computing.
C.S. Raghavendra, V.K.P. Kumar, S. Hariri, "Reliability Analysis in Distributed Systems," IEEE Transactions on Computers, vol. 37, no. 3, pp. 352-358, March 1988, doi:10.1109/12.2173