Computationally Efficient and Numerically Stable Reliability Bounds for Repairable Fault-Tolerant Systems
March 2002 (vol. 51 no. 3)
pp. 254-268

The transient analysis of large continuous-time Markov reliability models of repairable fault-tolerant systems is computationally expensive due to model stiffness. In this paper, we develop and analyze a method to compute bounds for a measure defined on a particular, but quite wide, class of continuous-time Markov models, encompassing both exact and bounding continuous-time Markov reliability models of fault-tolerant systems. The method is numerically stable and computes the bounds with well-controlled and specifiable-in-advance error. Computational effort can be traded off against the accuracy of the bounds. For a class of continuous-time Markov models, class $\mathrm{C}''$, including typical failure/repair reliability models with exponential failure and repair time distributions and repair in every state with failed components, the method can yield reasonably tight bounds at a very small computational cost. The method builds upon a recently proposed numerical method for the transient analysis of continuous-time Markov models called regenerative randomization.
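
The abstract does not spell out regenerative randomization, so the paper's method cannot be reproduced here. As a rough illustration of the idea it builds on, the following minimal sketch applies standard randomization (uniformization) to a hypothetical three-state failure/repair model and returns lower and upper bounds on the unreliability whose gap is at most a tolerance chosen in advance. The model (two parallel units with one repairman), the rates `lam` and `mu`, the function name `unreliability_bounds`, and the guard thresholds are all assumptions made for illustration, not taken from the paper.

```python
import math


def unreliability_bounds(lam, mu, t, eps=1e-10):
    """Bound P(system failed by time t) using standard randomization.

    Hypothetical model (an assumption for this sketch):
      state 0: both units up, state 1: one unit down and under repair,
      state 2: both units down (absorbing system failure).
    """
    # Randomization rate: at least the largest exit rate of any state.
    rate = max(2.0 * lam, lam + mu)
    rt = rate * t
    if rt > 700.0 or eps < 1e-12:
        # exp(-rt) would underflow, or eps is below what doubles can resolve.
        raise ValueError("this simple sketch needs rate*t <= 700 and eps >= 1e-12")

    # Transition matrix of the randomized discrete-time chain, P = I + Q/rate.
    p = [
        [1.0 - 2.0 * lam / rate, 2.0 * lam / rate, 0.0],
        [mu / rate, 1.0 - (lam + mu) / rate, lam / rate],
        [0.0, 0.0, 1.0],
    ]

    pi = [1.0, 0.0, 0.0]   # distribution after k steps, starting with both units up
    weight = math.exp(-rt)  # Poisson(k; rt) probability for k = 0
    cum = 0.0               # accumulated Poisson mass
    lower = 0.0             # truncated series: a lower bound on the unreliability
    k = 0
    while True:
        lower += weight * pi[2]
        cum += weight
        if 1.0 - cum <= eps:  # remaining Poisson tail is within tolerance
            break
        k += 1
        pi = [sum(pi[i] * p[i][j] for i in range(3)) for j in range(3)]
        weight *= rt / k

    # The neglected terms are each at most their Poisson weight.
    tail = max(0.0, 1.0 - cum)
    return lower, lower + tail


lo, hi = unreliability_bounds(lam=0.01, mu=1.0, t=10.0, eps=1e-10)
print(lo, hi)
```

The number of steps in this plain scheme grows roughly in proportion to `rate * t`, which is large for stiff models; that cost is the motivation the abstract gives for a cheaper bounding method.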

Index Terms:
fault-tolerant systems, repairable systems, reliability, continuous time Markov models, bounds, randomization
J.A. Carrasco, "Computationally Efficient and Numerically Stable Reliability Bounds for Repairable Fault-Tolerant Systems," IEEE Transactions on Computers, vol. 51, no. 3, pp. 254-268, March 2002, doi:10.1109/12.990125