What Types of Defects Are Really Discovered in Code Reviews?
May/June 2009 (vol. 35 no. 3)
pp. 430-448
Mika V. Mäntylä, Helsinki University of Technology, TKK
Casper Lassenius, Helsinki University of Technology, TKK
Research on code reviews has often focused on defect counts instead of defect types, which offers an imperfect view of code review benefits. In this paper, we classified the defects of nine industrial (C/C++) and 23 student (Java) code reviews, detecting 388 and 371 defects, respectively. First, we discovered that 75 percent of defects found during the review do not affect the visible functionality of the software. Instead, these defects improved software evolvability by making it easier to understand and modify. Second, we created a defect classification consisting of functional and evolvability defects. The evolvability defect classification is based on the defect types found in this study, but, for the functional defects, we studied and compared existing functional defect classifications. The classification can be useful for assigning code review roles, creating checklists, assessing software evolvability, and building software engineering tools. We conclude that, in addition to functional defects, code reviews find many evolvability defects and, thus, offer additional benefits over execution-based quality assurance methods that cannot detect evolvability defects. We suggest that code reviews may be most valuable for software products with long life cycles as the value of discovering evolvability defects in them is greater than for short life cycle systems.

Index Terms:
Code inspections and walkthroughs, enhancement, extensibility, maintainability, restructuring.
Mika V. Mäntylä, Casper Lassenius, "What Types of Defects Are Really Discovered in Code Reviews?," IEEE Transactions on Software Engineering, vol. 35, no. 3, pp. 430-448, May-June 2009, doi:10.1109/TSE.2008.71