On Distributed Computing Systems Reliability Analysis Under Program Execution Constraints
January 1994 (vol. 43 no. 1)
pp. 87-97

Presents an algorithm for computing the reliability of distributed computing systems (DCS). The algorithm, called the Fast Reliability Evaluation Algorithm, is based on the factoring theorem and employs several reliability-preserving reduction techniques. The effects of file distributions, program distributions, and network topologies on DCS reliability are studied in detail using the proposed algorithm. Compared with existing algorithms across various network topologies, file distributions, and program distributions, the proposed algorithm is more economical in both time and space. The ARPA network is studied to illustrate the feasibility of the proposed algorithm for computing distributed program reliability.
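
As a rough illustration of the factoring theorem mentioned above, the sketch below conditions on one edge at a time: R(G) = p_e * R(G with e contracted) + (1 - p_e) * R(G with e deleted). It is not the Fast Reliability Evaluation Algorithm itself, whose details are not given in this abstract. It assumes an undirected graph with perfectly reliable nodes and independent edge failures, applies none of the reliability-preserving reductions, and simplifies the distributed-program setting to plain K-terminal reliability (the terminals standing in for the nodes that hold a program and its files). The function names and the edge-list representation are illustrative assumptions.

```python
def _terminals_connected(edges, terminals):
    """Return True if every terminal lies in one connected component."""
    adjacency = {}
    for a, b, _ in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    start = next(iter(terminals))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return terminals <= seen


def k_terminal_reliability(edges, terminals):
    """Probability that all terminals can communicate (plain factoring).

    edges: list of (u, v, p) tuples, p = probability the edge works.
    terminals: iterable of node labels that must stay connected.
    Runs in exponential time; intended for small illustrative graphs only.
    """
    terminals = frozenset(terminals)
    if len(terminals) <= 1:
        return 1.0  # a single (merged) terminal is trivially connected
    # Self-loops created by contraction never affect connectivity.
    edges = [(u, v, p) for u, v, p in edges if u != v]
    if not _terminals_connected(edges, terminals):
        return 0.0  # no surviving edge set can connect the terminals
    (u, v, p), rest = edges[0], edges[1:]
    # Edge works: contract it by merging node v into node u.
    contracted = [(u if a == v else a, u if b == v else b, q)
                  for a, b, q in rest]
    merged = frozenset(u if t == v else t for t in terminals)
    # Edge fails: delete it and keep the terminals unchanged.
    return (p * k_terminal_reliability(contracted, merged)
            + (1.0 - p) * k_terminal_reliability(rest, terminals))


# Example: five-edge bridge network between s and t, hypothetical p = 0.5.
bridge = [("s", "a", 0.5), ("s", "b", 0.5), ("a", "b", 0.5),
          ("a", "t", 0.5), ("b", "t", 0.5)]
bridge_reliability = k_terminal_reliability(bridge, {"s", "t"})
```

Plain factoring like this grows exponentially with the number of edges, which is why practical algorithms combine it with reductions such as series and parallel simplification before each factoring step.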


Index Terms:
distributed processing; fault tolerant computing; programming theory; distributed computing systems; systems reliability analysis; program execution constraints; Fast Reliability Evaluation Algorithm; factoring theorem; file distributions; program distributions; network topologies; distributed program; graph theory; spanning tree; reliability-preserving reduction.
Deng-Jyi Chen, Min-Sheng Lin, "On Distributed Computing Systems Reliability Analysis Under Program Execution Constraints," IEEE Transactions on Computers, vol. 43, no. 1, pp. 87-97, Jan. 1994, doi:10.1109/12.250612