Issue No.04 - April (2013 vol.39)
pp: 552-569
S. Shivaji , Dept. of Comput. Sci., Univ. of California, Santa Cruz, Santa Cruz, CA, USA
E. James Whitehead , Dept. of Comput. Sci., Univ. of California, Santa Cruz, Santa Cruz, CA, USA
R. Akella , Technol. & Inf. Manage. Program, Univ. of California, Santa Cruz, Santa Cruz, CA, USA
Sunghun Kim , Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Kowloon, China
Machine learning classifiers have recently emerged as a way to predict the introduction of bugs in changes made to source code files. The classifier is first trained on software history and then used to predict whether an impending change causes a bug. The drawbacks of existing classifier-based bug prediction techniques are performance that is insufficient for practical use and slow prediction times caused by a large number of machine-learned features. This paper investigates multiple feature selection techniques that are generally applicable to classification-based bug prediction methods. The techniques discard less important features until optimal classification performance is reached. The total number of features used for training is substantially reduced, often to less than 10 percent of the original. The performance of Naive Bayes and Support Vector Machine (SVM) classifiers when using this technique is characterized on 11 software projects. Naive Bayes using feature selection provides a significant improvement in buggy F-measure (21 percent) over prior change classification bug prediction results (by the second and fourth authors [28]). The SVM's improvement in buggy F-measure is 9 percent. Interestingly, an analysis of performance for varying numbers of features shows that strong performance is achieved even at 1 percent of the original number of features.
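As an illustration of the idea described in the abstract, the following Python sketch computes the buggy F-measure and repeatedly discards the lowest-ranked features, keeping the best-scoring subset seen. It is a minimal sketch, not the authors' implementation: the function names `buggy_f_measure` and `select_features`, the `evaluate` callback, and the choice to halve the feature set at each step (`keep_fraction=0.5`) are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def buggy_f_measure(y_true: Sequence[int], y_pred: Sequence[int], buggy: int = 1) -> float:
    """F-measure (harmonic mean of precision and recall) for the buggy class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == buggy and p == buggy)
    fp = sum(1 for t, p in pairs if t != buggy and p == buggy)
    fn = sum(1 for t, p in pairs if t == buggy and p != buggy)
    if tp == 0:
        return 0.0  # no correctly predicted buggy change
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_features(
    ranked: List[str],
    evaluate: Callable[[List[str]], float],
    keep_fraction: float = 0.5,  # assumed step size, chosen for illustration
) -> Tuple[List[str], float]:
    """Drop the lowest-ranked features step by step; return the best subset seen.

    `ranked` lists features from most to least important. `evaluate` is a
    hypothetical callback that trains a classifier on the given features and
    returns its buggy F-measure.
    """
    current = list(ranked)
    best_subset, best_score = current, evaluate(current)
    while len(current) > 1:
        # Keep only the top-ranked fraction, but never fewer than one feature.
        current = current[: max(1, int(len(current) * keep_fraction))]
        score = evaluate(current)
        if score > best_score:
            best_subset, best_score = current, score
    return best_subset, best_score


if __name__ == "__main__":
    # Toy labels: 1 = buggy change, 0 = clean change.
    print(round(buggy_f_measure([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]), 3))
```

In practice, `evaluate` would wrap training and cross-validating a classifier such as Naive Bayes or an SVM on the selected features, and `ranked` would come from a feature-ranking method of the reader's choice.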
support vector machines, belief networks, learning (artificial intelligence), pattern classification, program debugging, buggy F-measure, code change-based bug prediction, machine learning classifier, source code file, software history, classifier-based bug prediction, machine learned feature reduction, feature selection technique, classification performance, naive Bayes classifier, support vector machine, SVM classifier, software project, Software, Support vector machines, History, Machine learning, Feature extraction, Measurement, Computer bugs, feature selection, Reliability, bug prediction, machine learning
S. Shivaji, E. James Whitehead, R. Akella, Sunghun Kim, "Reducing Features to Improve Code Change-Based Bug Prediction", IEEE Transactions on Software Engineering, vol.39, no. 4, pp. 552-569, April 2013, doi:10.1109/TSE.2012.43
[1] A. Ahmad and L. Dey, "A Feature Selection Technique for Classificatory Analysis," Pattern Recognition Letters, vol. 26, no. 1, pp. 43-56, 2005.
[2] E. Alpaydin, Introduction to Machine Learning. MIT Press, 2004.
[3] A. Anagnostopoulos, A. Broder, and K. Punera, "Effective and Efficient Classification on a Search-Engine Model," Proc. 15th ACM Int'l Conf. Information and Knowledge Management, Jan. 2006.
[4] L. Aversano, L. Cerulo, and C. Del Grosso, "Learning from Bug-Introducing Changes to Prevent Fault Prone Code," Proc. Foundations of Software Eng., pp. 19-26, 2007.
[5] A. Bachmann, C. Bird, F. Rahman, P. Devanbu, and A. Bernstein, "The Missing Links: Bugs and Bug-Fix Commits," Proc. 18th ACM SIGSOFT Int'l Symp. Foundations of Software Eng., pp. 97-106, 2010.
[6] A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem, C. Henri-Gros, A. Kamsky, S. McPeak, and D.R. Engler, "A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World," Comm. ACM, vol. 53, no. 2, pp. 66-75, 2010.
[7] J. Bevan, E. Whitehead Jr., S. Kim, and M. Godfrey, "Facilitating Software Evolution Research with Kenyon," Proc. 10th European Software Eng. Conf. Held Jointly with 13th ACM SIGSOFT Int'l Symp. Foundations of Software Eng., pp. 177-186, 2005.
[8] C. Bird, A. Bachmann, E. Aune, J. Duffy, A. Bernstein, V. Filkov, and P. Devanbu, "Fair and Balanced?: Bias in Bug-Fix Datasets," Proc. Seventh Joint Meeting of the European Software Eng. Conf. and the ACM SIGSOFT Symp. the Foundations of Software Eng., pp. 121-130, 2009.
[9] Z. Birnbaum and F. Tingey, "One-Sided Confidence Contours for Probability Distribution Functions," The Annals of Math. Statistics, vol. 22, pp. 592-596, 1951.
[10] L.C. Briand, J. Wüst, S.V. Ikonomovski, and H. Lounis, "Investigating Quality Factors in Object-Oriented Designs: An Industrial Case Study," Proc. Int'l Conf. Software Eng., pp. 345-354, 1999.
[11] V. Challagulla, F. Bastani, I. Yen, and R. Paul, "Empirical Assessment of Machine Learning Based Software Defect Prediction Techniques," Proc. IEEE 10th Int'l Workshop Object-Oriented Real-Time Dependable Systems, pp. 263-270, 2005.
[12] M. D'Ambros, M. Lanza, and R. Robbes, "Evaluating Defect Prediction Approaches: A Benchmark and an Extensive Comparison," Empirical Software Eng., pp. 1-47, 2011.
[13] B. Efron and R. Tibshirani, An Introduction to the Bootstrap. Chapman & Hall/CRC, 1993.
[14] K. Elish and M. Elish, "Predicting Defect-Prone Software Modules Using Support Vector Machines," J. Systems and Software, vol. 81, no. 5, pp. 649-660, 2008.
[15] R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin, "LIBLINEAR: A Library for Large Linear Classification," J. Machine Learning Research, vol. 9, pp. 1871-1874, 2008.
[16] T. Fawcett, "An Introduction to ROC Analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861-874, 2006.
[17] J. Friedman, T. Hastie, and R. Tibshirani, The Elements of Statistical Learning, Springer Series in Statistics, vol. 1. Springer, 2001.
[18] K. Gao, T. Khoshgoftaar, H. Wang, and N. Seliya, "Choosing Software Metrics for Defect Prediction: An Investigation on Feature Selection Techniques," Software: Practice and Experience, vol. 41, no. 5, pp. 579-606, 2011.
[19] T. Gyimóthy, R. Ferenc, and I. Siket, "Empirical Validation of Object-Oriented Metrics on Open Source Software for Fault Prediction," IEEE Trans. Software Eng., vol. 31, no. 10, pp. 897-910, Oct. 2005.
[20] M. Hall and G. Holmes, "Benchmarking Attribute Selection Techniques for Discrete Class Data Mining," IEEE Trans. Knowledge and Data Eng., vol. 15, no. 6, pp. 1437-1447, Nov./Dec. 2003.
[21] A. Hassan and R. Holt, "The Top Ten List: Dynamic Fault Prediction," Proc. 21st IEEE Int'l Conf. Software Maintenance, Jan. 2005.
[22] H. Hata, O. Mizuno, and T. Kikuno, "An Extension of Fault-Prone Filtering Using Precise Training and a Dynamic Threshold," Proc. Int'l Working Conf. Mining Software Repositories, 2008.
[23] "Maintenance, Understanding, Metrics and Documentation Tools for Ada, C, C++, Java, and Fortran," http:/, 2005.
[24] T. Joachims, "Text Categorization with Support Vector Machines: Learning with Many Relevant Features," Proc. 10th European Conf. Machine Learning, pp. 137-142, 1998.
[25] T. Joachims, "Training Linear SVMs in Linear Time," Proc. 12th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, p. 226, 2006.
[26] T. Khoshgoftaar and E. Allen, "Predicting the Order of Fault-Prone Modules in Legacy Software," Proc. Int'l Symp. Software Reliability Eng., pp. 344-353, 1998.
[27] T. Khoshgoftaar and E. Allen, "Ordering Fault-Prone Software Modules," Software Quality J., vol. 11, no. 1, pp. 19-37, 2003.
[28] S. Kim, E. Whitehead Jr., and Y. Zhang, "Classifying Software Changes: Clean or Buggy?" IEEE Trans. Software Eng., vol. 34, no. 2, pp. 181-196, Mar./Apr. 2008.
[29] S. Kim, T. Zimmermann, E. Whitehead Jr., and A. Zeller, "Predicting Faults from Cached History," Proc. 29th Int'l Conf. Software Eng., pp. 489-498, 2007.
[30] S. Kim, T. Zimmermann, E. Whitehead Jr., and A. Zeller, "Predicting Faults from Cached History," Proc. 29th Int'l Conf. Software Eng., pp. 489-498, 2007.
[31] I. Kononenko, "Estimating Attributes: Analysis and Extensions of Relief," Proc. European Conf. Machine Learning, pp. 171-182, 1994.
[32] B. Larsen and C. Aone, "Fast and Effective Text Mining Using Linear-Time Document Clustering," Proc. Fifth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 16-22, 1999.
[33] S. Lessmann, B. Baesens, C. Mues, and S. Pietsch, "Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings," IEEE Trans. Software Eng., vol. 34, no. 4, pp. 485-496, July/Aug. 2008.
[34] D. Lewis, "Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval," Proc. 10th European Conf. Machine Learning, pp. 4-15, 1998.
[35] H. Liu and H. Motoda, Feature Selection for Knowledge Discovery and Data Mining. Springer, 1998.
[36] J. Madhavan and E. Whitehead Jr., "Predicting Buggy Changes Inside an Integrated Development Environment," Proc. OOPSLA Workshop Eclipse Technology eXchange, 2007.
[37] F. Massey Jr., "The Kolmogorov-Smirnov Test for Goodness of Fit," J. Am. Statistical Assoc., vol. 46, pp. 68-78, 1951.
[38] A. McCallum and K. Nigam, "A Comparison of Event Models for Naive Bayes Text Classification," Proc. AAAI Workshop Learning for Text Categorization, Jan. 1998.
[39] T. Menzies, J. Greenwald, and A. Frank, "Data Mining Static Code Attributes to Learn Defect Predictors," IEEE Trans. Software Eng., vol. 33, no. 1, pp. 2-13, Jan. 2007.
[40] A. Mockus and L. Votta, "Identifying Reasons for Software Changes Using Historic Databases," Proc. Int'l Conf. Software Maintenance, pp. 120-130, 2000.
[41] A. Mockus and D. Weiss, "Predicting Risk of Software Changes," Bell Labs Technical J., vol. 5, no. 2, pp. 169-180, 2000.
[42] S. Morasca and G. Ruhe, "A Hybrid Approach to Analyze Empirical Software Engineering Data and Its Application to Predict Module Fault-Proneness in Maintenance," J. Systems Software, vol. 53, no. 3, pp. 225-237, 2000.
[43] R. Moser, W. Pedrycz, and G. Succi, "A Comparative Analysis of the Efficiency of Change Metrics and Static Code Attributes for Defect Prediction," Proc. 30th ACM/IEEE Int'l Conf. Software Eng., pp. 181-190, 2008.
[44] T. Ostrand, E. Weyuker, and R. Bell, "Predicting the Location and Number of Faults in Large Software Systems," IEEE Trans. Software Eng., vol. 31, no. 4, pp. 340-355, Apr. 2005.
[45] J. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
[46] M. Robnik-Šikonja and I. Kononenko, "Theoretical and Empirical Analysis of ReliefF and RReliefF," Machine Learning, vol. 53, no. 1, pp. 23-69, 2003.
[47] S. Scott and S. Matwin, "Feature Engineering for Text Classification," Proc. Machine Learning Int'l Workshop, pp. 379-388, 1999.
[48] E. Shihab, Z. Jiang, W. Ibrahim, B. Adams, and A. Hassan, "Understanding the Impact of Code and Process Metrics on Post-Release Defects: A Case Study on the Eclipse Project," Proc. ACM/IEEE Int'l Symp. Empirical Software Eng. and Measurement, pp. 1-10, 2010.
[49] S. Shivaji, E. Whitehead Jr., R. Akella, and S. Kim, "Reducing Features to Improve Bug Prediction," Proc. IEEE/ACM Int'l Conf. Automated Software Eng., pp. 600-604, 2009.
[50] J. Śliwerski, T. Zimmermann, and A. Zeller, "When Do Changes Induce Fixes?" Proc. Int'l Workshop Mining Software Repositories, pp. 24-28, 2005.
[51] Q. Song, Z. Jia, M. Shepperd, S. Ying, and J. Liu, "A General Software Defect-Proneness Prediction Framework," IEEE Trans. Software Eng., vol. 37, no. 3, pp. 356-370, May/June 2011.
[52] H.K. Wright, M. Kim, and D.E. Perry, "Validity Concerns in Software Engineering Research," Proc. FSE/SDP Workshop Future of Software Eng. Research, pp. 411-414, 2010.
[53] T. Zimmermann and P. Weißgerber, "Preprocessing CVS Data for Fine-Grained Analysis," Proc. First Int'l Workshop Mining Software Repositories, pp. 2-6, 2004.