Issue No.02 - Feb. (2013 vol.39)
pp: 237-257
Karel Dejaeger, Katholieke Universiteit Leuven, Leuven
Thomas Verbraken, Katholieke Universiteit Leuven, Leuven
Bart Baesens, Katholieke Universiteit Leuven, Leuven
ABSTRACT
Software testing is a crucial activity during software development. Fault prediction models, which draw upon the machine learning literature, assist practitioners by identifying faulty software code up front. The Naive Bayes classifier in particular is often applied in this regard, with predictive performance and comprehensibility cited as its major strengths, yet a number of alternative Bayesian algorithms that increase the possibility of constructing simpler networks with fewer nodes and arcs remain unexplored. This study contributes to the literature by considering 15 different Bayesian Network (BN) classifiers and comparing them to other popular machine learning techniques. Furthermore, the applicability of the Markov blanket principle for feature selection, which is a natural extension of BN theory, is investigated. The results, in terms of both the AUC and the recently introduced H-measure, are rigorously tested using the statistical framework of Demšar. It is concluded that simple and comprehensible networks with fewer nodes can be constructed using BN classifiers other than the Naive Bayes classifier. Furthermore, it is found that comprehensibility and predictive performance need to be balanced, and that the development context should also be taken into account during model selection.
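For readers unfamiliar with the statistical framework of Demšar mentioned in the abstract: it is usually summarized as a Friedman test on the average ranks of the classifiers across data sets, followed by post-hoc comparisons. The sketch below illustrates only the first step and is not taken from the paper. The function names `average_ranks` and `friedman_statistic` and the AUC values are hypothetical, chosen purely for illustration.

```python
def average_ranks(scores):
    """Average rank of each classifier over all data sets.

    scores[i][j] is the performance (e.g. AUC) of classifier j on data set i.
    Rank 1 is the best (highest) score; tied scores share the mean rank.
    """
    n = len(scores)
    k = len(scores[0])
    totals = [0.0] * k
    for row in scores:
        # Classifier indices ordered from best to worst on this data set.
        order = sorted(range(k), key=lambda j: -row[j])
        pos = 0
        while pos < k:
            end = pos
            # Extend the group while the next classifier has the same score.
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1 .. end+1
            for idx in order[pos:end + 1]:
                totals[idx] += shared
            pos = end + 1
    return [t / n for t in totals]


def friedman_statistic(scores):
    """Friedman chi-square statistic computed from the average ranks."""
    n = len(scores)
    k = len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


# Hypothetical AUC values: 4 data sets (rows) x 3 classifiers (columns).
example_auc = [
    [0.80, 0.75, 0.70],
    [0.82, 0.78, 0.71],
    [0.79, 0.77, 0.72],
    [0.85, 0.80, 0.74],
]
print(average_ranks(example_auc))
print(friedman_statistic(example_auc))
```

A large statistic relative to the chi-square distribution with k - 1 degrees of freedom suggests that the classifiers do not all perform equally, which is the point at which post-hoc tests would be applied.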
INDEX TERMS
Software, Predictive models, Bayesian methods, Measurement, Capability maturity model, Probability distribution, Machine learning, comprehensibility, Software fault prediction, Bayesian networks, classification
CITATION
Karel Dejaeger, Thomas Verbraken, Bart Baesens, "Toward Comprehensible Software Fault Prediction Models Using Bayesian Network Classifiers", IEEE Transactions on Software Engineering, vol. 39, no. 2, pp. 237-257, Feb. 2013, doi:10.1109/TSE.2012.20