X-Ware Reliability and Availability Modeling
February 1992 (vol. 18 no. 2)
pp. 130-147

This paper addresses the problem of modeling a system's reliability and availability with respect to the various classes of faults (physical and design, internal and external) that may affect the service delivered to its users. Combined hardware-and-software models are currently the exception, despite users' requirements, which are expressed in terms of failures independently of their sources, i.e., the various classes of faults. The causes of this situation are analyzed; it is shown that there is no theoretical impediment to deriving such models, and that classical reliability theory can be generalized to cover both the hardware and software viewpoints, that is, X-Ware.
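
As a rough illustration of what "failures independently of their sources" can mean in classical reliability theory, the sketch below combines a hardware (physical fault) failure rate and a software (design fault) failure rate into one system-level figure. It is not the model developed in the paper: it assumes independent failure sources with constant rates, a single shared restoration rate, and made-up numbers, and every name in it is hypothetical.

```python
import math


def series_reliability(t_hours, failure_rates_per_hour):
    """Probability of no failure up to time t.

    Assumes independent failure sources, each with a constant rate,
    so the rates simply add (textbook exponential series model).
    """
    total_rate = sum(failure_rates_per_hour)
    return math.exp(-total_rate * t_hours)


def steady_state_availability(failure_rates_per_hour, restoration_rate_per_hour):
    """Long-run fraction of time the service is delivered.

    Assumes a two-state (up/down) Markov model with one restoration
    rate shared by all failure sources; this is a simplification.
    """
    total_rate = sum(failure_rates_per_hour)
    return restoration_rate_per_hour / (total_rate + restoration_rate_per_hour)


# Hypothetical rates, chosen only for illustration (not taken from the paper).
rates = {
    "hardware_physical_faults": 1e-4,  # failures per hour
    "software_design_faults": 4e-4,    # failures per hour
}

reliability_1000h = series_reliability(1000.0, rates.values())
availability = steady_state_availability(rates.values(), 0.5)

print(f"R(1000 h) = {reliability_1000h:.4f}")
print(f"Steady-state availability = {availability:.6f}")
```

From the user's side only the summed rate matters here, which is the point of expressing requirements in terms of failures rather than fault classes. Constant rates ignore reliability growth as design faults are removed, so this is at most a starting point.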

Index Terms:
availability modeling; faults; software models; classical reliability theory; software viewpoints; X-Ware; fault tolerant computing; performance evaluation; reliability theory; software reliability
Citation:
J.-C. Laprie, K. Kanoun, "X-Ware Reliability and Availability Modeling," IEEE Transactions on Software Engineering, vol. 18, no. 2, pp. 130-147, Feb. 1992, doi:10.1109/32.121755