Empirical Analysis of Software Fault Content and Fault Proneness Using Bayesian Methods
October 2007 (vol. 33 no. 10)
pp. 675-686
We present a methodology for Bayesian analysis of software quality. We cast our research in the broader context of constructing a causal framework that can include process, product, and other diverse sources of information regarding fault introduction during the software development process. In this paper, we discuss the aspect of relating internal product metrics to external quality metrics. Specifically, we build a Bayesian network (BN) model to relate object-oriented software metrics to software fault content and fault proneness. Assuming that the relationship can be described as a generalized linear model, we derive parametric functional forms for the conditional distributions of the target nodes in the BN. These functional forms are shown to be able to represent linear, Poisson, and binomial logistic regression. The models are empirically evaluated using a public domain data set from a software subsystem. The results show that our approach produces statistically significant estimations, and that our overall modeling method performs no worse than existing techniques.
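As a rough illustration of the generalized linear model assumption mentioned in the abstract, the sketch below shows the three standard inverse link functions (identity, exponential, logistic) that map a linear combination of metrics to the mean of a target node. It is a minimal sketch, not the parametric forms derived in the paper: the abstract does not give them, and the Bayesian treatment of the coefficients (priors and posterior inference) is left out entirely. The function names, metric values, and coefficients are all hypothetical.

```python
import math

def linear_predictor(intercept, weights, metrics):
    """Return eta = intercept + sum_i weights[i] * metrics[i]."""
    return intercept + sum(w * x for w, x in zip(weights, metrics))

def linear_mean(eta):
    """Identity link: mean of a Gaussian target (linear regression)."""
    return eta

def poisson_mean(eta):
    """Log link: expected fault count under Poisson regression."""
    return math.exp(eta)

def logistic_probability(eta):
    """Logit link: probability that a class is fault prone."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical metric values for one class and made-up coefficients
# (illustrative only; not taken from the paper or its data set).
metrics = [3.0, 1.0, 5.0]
weights = [0.2, -0.1, 0.05]
intercept = -1.0

eta = linear_predictor(intercept, weights, metrics)
print("linear predictor:", eta)
print("expected fault count (Poisson):", poisson_mean(eta))
print("fault-proneness probability (logistic):", logistic_probability(eta))
```

The same linear predictor feeds all three models; only the link function and the assumed distribution of the target node change, which is why a single functional form can cover fault content (a count) and fault proneness (a binary outcome).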

Index Terms:
Bayesian analysis, Bayesian networks, defects, fault proneness, metrics, object-oriented, regression, software quality
Ganesh J. Pai, Joanne Bechta Dugan, "Empirical Analysis of Software Fault Content and Fault Proneness Using Bayesian Methods," IEEE Transactions on Software Engineering, vol. 33, no. 10, pp. 675-686, Oct. 2007, doi:10.1109/TSE.2007.70722