A Unified Framework for the Performability Evaluation of Fault-Tolerant Computer Systems
March 1993 (vol. 42 no. 3)
pp. 312-326

The problem of evaluating the performability density and distribution of degradable computer systems is considered. A generalized model of performability is adopted, wherein the dynamics of configuration modes are modeled as a nonhomogeneous Markov process, and the performance rate in each configuration mode can be time dependent. The key to the development of a unifying mathematical framework is the introduction of two related performability processes: the forward performability process over the interval (0,t), and the performability-to-go process over the interval (t,T), where T is the mission time. Techniques from stochastic differential equations show that the joint density of the forward performability and configuration states satisfies a linear, hyperbolic partial differential equation (PDE) with time-dependent coefficients that runs forward in time, while the performability-to-go process satisfies an adjoint PDE that runs backward in time. A numerical method for solving the PDEs is presented and illustrated with examples.
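
As a rough illustration of the forward equation described above, the following Python sketch steps a joint density f_i(y,t) of accumulated performability y and configuration state i forward in time. It is not the numerical method of the paper. It assumes the standard Markov reward form of the forward PDE, df_i/dt + r_i(t) df_i/dy = sum_j q_ji(t) f_j, and uses simple first-order operator splitting (an explicit Euler step for the state transitions followed by an upwind step for the drift in y). The function name, the two-state example, and the rates lam and mu are hypothetical choices made only for this illustration.

```python
import math

def forward_performability_density(q, r, n_states, t_end, dy, dt, ny):
    """Sketch of a forward solver for the assumed PDE
         df_i/dt + r_i(t) df_i/dy = sum_j q_ji(t) f_j.
    q(t)[j][i] is the transition rate from state j to state i (rows sum
    to zero); r(t)[i] >= 0 is the performance rate in state i.
    The system starts in state 0 with zero accumulated performability."""
    f = [[0.0] * ny for _ in range(n_states)]
    f[0][0] = 1.0 / dy  # discrete stand-in for a Dirac delta at y = 0
    steps = int(round(t_end / dt))
    for n in range(steps):
        t = n * dt
        qt, rt = q(t), r(t)
        # Step 1: explicit Euler update for the state transitions.
        g = [[f[i][k] + dt * sum(qt[j][i] * f[j][k] for j in range(n_states))
              for k in range(ny)]
             for i in range(n_states)]
        # Step 2: first-order upwind advection in y at speed r_i(t).
        for i in range(n_states):
            c = rt[i] * dt / dy
            if not 0.0 <= c <= 1.0:
                raise ValueError("need 0 <= r*dt/dy <= 1 for stability")
            f[i] = [g[i][k] - c * (g[i][k] - (g[i][k - 1] if k > 0 else 0.0))
                    for k in range(ny)]
    return f

# Hypothetical two-state example: state 0 is "up" (rate 1), state 1 is "down" (rate 0).
lam, mu = 0.5, 1.0                       # assumed failure and repair rates
q = lambda t: [[-lam, lam], [mu, -mu]]   # constant here, but may depend on t
r = lambda t: [1.0, 0.0]
dy = dt = 0.01
t_end = 1.0
ny = int(round(t_end / dy)) + 2          # grid wide enough that no mass leaves

f = forward_performability_density(q, r, 2, t_end, dy, dt, ny)
mass = sum(sum(row) for row in f) * dy   # total probability on the grid
p_up = sum(f[0]) * dy                    # marginal probability of state 0
p_up_exact = mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * t_end)
```

Summing f over the states and integrating over y up to a threshold gives an approximation of the performability distribution at time t_end. Comparing p_up with p_up_exact checks only the state marginal, not the shape of the density in y, and a first-order upwind scheme smears sharp fronts whenever r*dt/dy is below 1.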

[1] M. D. Beaudry, "Performance-related reliability measures for computing systems," IEEE Trans. Comput., vol. C-27, pp. 540-547, June 1978.
[2] J. F. Meyer, "On evaluating performability of degradable computing systems," IEEE Trans. Comput., vol. C-29, no. 8, pp. 720-731, Aug. 1980.
[3] B. R. Iyer, L. Donatiello, and P. Heidelberger, "Analysis of performability for stochastic models of fault-tolerant systems," IEEE Trans. Comput., vol. C-35, no. 10, Oct. 1986.
[4] B. Ciciani and V. Grassi, "Performability evaluation of fault-tolerant satellite systems," IEEE Trans. Commun., vol. COM-35, no. 4, Apr. 1987.
[5] V. Grassi, L. Donatiello, and G. Iazeolla, "Performability evaluation of multicomponent fault-tolerant systems," IEEE Trans. Reliability, vol. 37, pp. 216-222, June 1988.
[6] R. M. Smith, K. S. Trivedi, and A. V. Ramesh, "Performability analysis: Measures, an algorithm and a case study," IEEE Trans. Comput., vol. C-37, no. 4, pp. 406-417, Apr. 1988.
[7] K. R. Pattipati and S. A. Shah, "On the computational aspects of performability models of fault-tolerant computer systems," IEEE Trans. Comput., vol. C-39, no. 7, pp. 832-836, July 1990.
[8] E. de Souza e Silva and H. R. Gail, "Calculating cumulative operational time distributions of repairable computer systems," IEEE Trans. Comput., vol. C-35, pp. 322-332, 1986.
[9] S. M. Ross, Introduction to Probability Models. New York: Academic, 1985.
[10] A. Goyal and A. N. Tantawi, "A measure of guaranteed availability and its numerical evaluation," IEEE Trans. Comput., vol. C-37, no. 1, pp. 25-32, Jan. 1988.
[11] V. G. Kulkarni, V. F. Nicola, R. M. Smith, and K. S. Trivedi, "Numerical evaluation of performability measures and job completion time in repairable fault-tolerant systems," in Proc. 1986 Int. Symp. Fault-Tolerant Comput., Vienna, Austria, 1986, pp. 252-257.
[12] A. E. Bryson and Y. Ho, Applied Optimal Control. New York: Wiley, 1969.
[13] I. I. Gihman and A. V. Skorohod, Stochastic Differential Equations. New York: Springer, 1972.
[14] M. H. A. Davis, "Piecewise deterministic Markov processes: A general class of non-diffusion stochastic models," J. Royal Statistical Soc. B, vol. 46, pp. 353-388, 1984.
[15] R. A. Howard, Dynamic Probabilistic Systems, Vols. I and II. New York: Wiley, 1971.
[16] D. P. Siewiorek and R. S. Swarz, The Theory and Practice of Reliable System Design. Bedford, MA: Digital Press, 1982.
[17] A. Goyal, A. N. Tantawi, and K. S. Trivedi, "A measure of guaranteed availability," IBM Res. Rep. RC 11341, Aug. 1985.
[18] B. L. Rozdestvenskii and N. N. Janenko, Systems of Quasilinear Equations and Their Applications to Gas Dynamics. American Mathematical Society, 1983.
[19] S. C. Chapra and R. P. Canale, Numerical Methods for Engineers. New York: McGraw-Hill, 1988.
[20] D. Gottlieb and S. A. Orszag, Numerical Analysis of Spectral Methods: Theory and Applications. Philadelphia, PA: SIAM, 1977.
[21] S. F. McCormick, Ed., Multigrid Methods. Philadelphia, PA: SIAM, 1987.
[22] W. Hackbusch, Multigrid Methods and Applications, Springer Series in Computational Mathematics, vol. 4. Berlin, Germany: Springer-Verlag, 1985.
[23] E. de Souza e Silva and H. R. Gail, "Calculating availability and performability measures of repairable computer systems using randomization," J. ACM, vol. 36, no. 1, Jan. 1989.
[24] N. Viswanadham, Y. Narahari, and R. Ram, "Performability of automated manufacturing systems," Control and Dynamic Systems, vol. 47, pp. 77-120, Academic, 1991.
[25] N. Viswanadham, K. R. Pattipati, and V. Gopalakrishna, "Performability studies of AMSs with multiple part types," invited paper at the 1993 IEEE Robot. and Automat. Conf., Atlanta, GA, May 1993 (also to be submitted to IEEE Trans. Syst., Man, Cybern., Nov. 1992).
[26] R. M. Smith and K. S. Trivedi, "The analysis of computer systems using Markov reward models," in Stochastic Models of Computer and Communication Systems, H. Takagi, Ed. Elsevier, 1989.
[27] Y. Li, "Analysis of Markov reward models of fault-tolerant computer systems," M.S. thesis, Dep. Elec. Syst. Eng., Univ. Connecticut, Storrs, CT 06269-3157, 1990.
[28] D. Kahaner, C. Moler, and S. Nash, Numerical Methods and Software. Englewood Cliffs, NJ: Prentice-Hall, 1989.
[29] A. Reibman and K. S. Trivedi, "Transient analysis of cumulative measures of Markov model behavior," Stochastic Models, vol. 5, no. 4, pp. 683-710, 1989.
[30] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes. Cambridge, U.K.: Cambridge University Press, 1986.
[31] G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd ed. Baltimore, MD: Johns Hopkins Press, 1989.
[32] D. P. Bhandarkar, "Analysis of memory interference in multiprocessors," IEEE Trans. Comput., vol. C-24, no. 11, pp. 897-908, Nov. 1975.
[33] G. N. Cherkesov, "Semi-Markovian models of reliability of multi-channel systems with unreplenishable reserve of time," Eng. Cybern., vol. 18, pp. 65-78, Mar. 1981.
[34] K. R. Pattipati, Y. Li, and H. A. P. Blom, "A unified framework for the performability evaluation of fault-tolerant computer systems," TR-92-12, Dep. Elec. Syst. Eng., Univ. Connecticut, Storrs, CT 06269-3157.

Index Terms:
performability evaluation; fault-tolerant computer systems; performability density; degradable computer systems; configuration modes; nonhomogeneous Markov process; mathematical framework; forward performability process; performability-to-go; mission time; differential equations; hyperbolic partial differential equation; fault tolerant computing; Markov processes; partial differential equations; performance evaluation.
K.R. Pattipati, Y. Li, H.A.P. Blom, "A Unified Framework for the Performability Evaluation of Fault-Tolerant Computer Systems," IEEE Transactions on Computers, vol. 42, no. 3, pp. 312-326, March 1993, doi:10.1109/12.210174