An Analysis of a Reliability Model for Repairable Fault-Tolerant Systems
March 1993 (vol. 42 no. 3)
pp. 327-339

This paper discusses the ARIES reliability model, which represents a class of repairable and nonrepairable fault-tolerant systems as a continuous-time Markov chain. ARIES uses the Lagrange-Sylvester interpolation formula to directly compute the exponential of the state transition rate matrix (STRM) that appears in the solution of the Markov chain. The properties of the STRM for ARIES repairable systems are analyzed. Well-established results in matrix theory are used to find an efficient solution for reliability computation when the eigenvalues of the STRM are distinct. A class of systems that ARIES models but for which the solution technique is inapplicable is identified. The solution method uses several transformations that are known to be numerically stable. It also offers a facility for incrementally computing reliability when the number of spares in the fault-tolerant system is increased by one.
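
As a rough illustration of the Lagrange-Sylvester formula mentioned in the abstract, the sketch below evaluates exp(At) for a matrix with distinct eigenvalues as the sum over i of exp(λ_i t) times the product over j ≠ i of (A − λ_j I)/(λ_i − λ_j). It is not the ARIES implementation and does not include the paper's stabilizing transformations. The function names, the two-state repairable model, and the rates `lam0`, `lam1`, and `mu` are all assumptions made for this example, and the row convention (entry (r, c) is the rate from state r to state c) is likewise assumed.

```python
import math


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sylvester_expm(A, eigenvalues, t):
    """Return exp(A*t) via the Lagrange-Sylvester formula.

    Only valid when the supplied eigenvalues of A are pairwise distinct.
    """
    n = len(A)
    result = [[0.0] * n for _ in range(n)]
    for i, lam_i in enumerate(eigenvalues):
        # Build the i-th covariant: prod_{j != i} (A - lam_j I) / (lam_i - lam_j)
        cov = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
        for j, lam_j in enumerate(eigenvalues):
            if j == i:
                continue
            if lam_i == lam_j:
                raise ValueError("eigenvalues must be distinct")
            factor = [[(A[r][c] - (lam_j if r == c else 0.0)) / (lam_i - lam_j)
                       for c in range(n)] for r in range(n)]
            cov = mat_mul(cov, factor)
        weight = math.exp(lam_i * t)
        for r in range(n):
            for c in range(n):
                result[r][c] += weight * cov[r][c]
    return result


# Hypothetical repairable system restricted to its two operational states:
# state 0 = fully working, state 1 = degraded and under repair.
lam0, lam1, mu = 2e-3, 1e-3, 0.1      # assumed failure and repair rates (per hour)
A = [[-lam0, lam0],
     [mu, -(lam1 + mu)]]              # degraded state fails for good at rate lam1

# Eigenvalues of a 2x2 matrix from its trace and determinant.
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = math.sqrt(tr * tr - 4.0 * det)
eigs = [(tr + root) / 2.0, (tr - root) / 2.0]

# Starting in state 0, reliability is the probability of still being operational.
P = sylvester_expm(A, eigs, 100.0)
reliability = P[0][0] + P[0][1]
print(reliability)
```

For this 2x2 model the discriminant is always positive, so the eigenvalues are real and distinct. Larger models would need a general eigenvalue routine, and the formula cannot be used as written when eigenvalues repeat.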

Index Terms:
reliability model; repairable fault-tolerant systems; ARIES; continuous-time Markov chain; Lagrange-Sylvester interpolation formula; state transition rate matrix; matrix theory; eigenvalues; spares; fault tolerant computing; Markov processes; matrix algebra; reliability.
M. Balakrishnan, C.S. Raghavendra, "An Analysis of a Reliability Model for Repairable Fault-Tolerant Systems," IEEE Transactions on Computers, vol. 42, no. 3, pp. 327-339, March 1993, doi:10.1109/12.210175