Reliability Analysis of Large Software Systems: Defect Data Modeling
February 1990 (vol. 16 no. 2)
pp. 141-152

The author analyzes and models the software development process, and presents field experience for large distributed systems. Defect removal is shown to be the bottleneck in achieving the appropriate quality level before system deployment in the field. The time to defect detection, the defect repair time and a factor reflecting the introduction of new defects due to imperfect defect repair are some of the constants in the laws governing defect removal. Test coverage is a measure of defect removal effectiveness. A birth-death mathematical model based on these constants is developed and used to model field failure report data. The birth-death model is contrasted with a more classical decreasing exponential model. Both models indicate that defect removal is not a cost-effective way to achieve quality. As a result of the long latency of software defects in a system, defect prevention is suggested to be a far more practical solution to quality than defect removal.
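The contrast between the two models named in the abstract can be illustrated with a small numerical sketch. This is not the paper's model: the two-stage flow (latent defects are detected, then repaired, and a fraction of repairs introduce new latent defects), the function names, and all parameter values are assumptions made here for illustration only.

```python
import math


def birth_death_latent(n0, t_detect, t_repair, p_reintro, t_end, dt=0.01):
    """Euler integration of an assumed two-stage defect flow.

    n0        -- initial number of latent defects (hypothetical)
    t_detect  -- mean time to detect a latent defect
    t_repair  -- mean time to repair a detected defect
    p_reintro -- assumed fraction of repairs that introduce a new defect
    Returns (latent, in_repair) at time t_end.
    """
    latent, in_repair = float(n0), 0.0
    for _ in range(int(round(t_end / dt))):
        detected = latent / t_detect * dt    # defects found in this step
        repaired = in_repair / t_repair * dt  # repairs completed in this step
        # Imperfect repair feeds new defects back into the latent pool.
        latent += p_reintro * repaired - detected
        in_repair += detected - repaired
    return latent, in_repair


def exponential_latent(n0, t_detect, t_end):
    """Classical decreasing exponential: detection only, perfect repair."""
    return n0 * math.exp(-t_end / t_detect)


if __name__ == "__main__":
    # Illustrative values only; none of them come from the paper.
    for p in (0.0, 0.3):
        latent, in_repair = birth_death_latent(1000, 10.0, 2.0, p, 20.0)
        print(f"p_reintro={p}: latent={latent:.1f}, in_repair={in_repair:.1f}")
    print(f"exponential: latent={exponential_latent(1000, 10.0, 20.0):.1f}")
```

With the reintroduction fraction set to zero, the latent-defect count in this sketch follows the exponential curve; any positive fraction leaves more latent defects at the same point in time, which is the qualitative effect of imperfect repair that the abstract describes.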

Index Terms:
reliability analysis; defect data modeling; software development; large distributed systems; bottleneck; defect removal; birth-death mathematical model; field failure report data; quality; distributed processing; large-scale systems; program testing; software reliability.
Y. Levendel, "Reliability Analysis of Large Software Systems: Defect Data Modeling," IEEE Transactions on Software Engineering, vol. 16, no. 2, pp. 141-152, Feb. 1990, doi:10.1109/32.44378