Using the Conceptual Cohesion of Classes for Fault Prediction in Object-Oriented Systems
March/April 2008 (vol. 34 no. 2)
pp. 287-300
High cohesion is a desirable property of software, as it positively impacts understanding, reuse, and maintenance. Currently proposed measures for cohesion in Object-Oriented (OO) software reflect particular interpretations of cohesion and capture different aspects of it. The paper proposes a new measure for the cohesion of classes in an OO software system, based on the analysis of the unstructured information embedded in the source code, such as comments and identifiers. The measure, named the Conceptual Cohesion of Classes (C3), is inspired by the mechanisms used to measure textual coherence in cognitive psychology and computational linguistics. The paper presents the principles and the technology behind the C3 measure. A large case study on three open source software systems is presented, which compares the new measure with an extensive set of existing metrics and uses them to construct models that predict software faults. The case study shows that the novel measure captures different aspects of class cohesion than any of the existing cohesion measures. In addition, combining C3 with existing structural cohesion metrics proves to be a better predictor of faulty classes than different combinations of structural cohesion metrics.
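To make the idea of a text-based cohesion measure concrete, the following is a minimal sketch and not the authors' implementation. It assumes each method of a class is represented by the words in its identifiers and comments, and it scores a class by the average pairwise cosine similarity of those word vectors. It uses plain term counts as a simplification, whereas the references cited by the paper point to Latent Semantic Analysis; the function names `tokenize`, `cosine`, and `conceptual_cohesion` are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Split identifiers and comments into lowercase word tokens."""
    # Insert a space at camelCase boundaries, then keep alphabetic runs only.
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def conceptual_cohesion(method_texts):
    """Average pairwise textual similarity of a class's methods.

    Assumption: plain term counts stand in for the semantic space used
    in the paper, so this only approximates the general idea.
    """
    vectors = [Counter(tokenize(t)) for t in method_texts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0  # a class with fewer than two methods has no pairs to compare
    average = sum(cosine(a, b) for a, b in pairs) / len(pairs)
    return max(average, 0.0)


if __name__ == "__main__":
    file_class = ["openFile", "readFile", "closeFile"]
    mixed_class = ["openFile", "renderPixel", "sendMail"]
    print(conceptual_cohesion(file_class))
    print(conceptual_cohesion(mixed_class))
```

In this sketch, a class whose methods share vocabulary scores higher than one whose methods use unrelated terms, which is the intuition behind measuring cohesion from comments and identifiers rather than from structural links such as shared attributes.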

Index Terms:
Maintainability, Metrics/Measurement, Quality analysis and evaluation, Restructuring, reverse engineering, and reengineering, Code documentation, Document analysis, Document indexing
Citation:
Andrian Marcus, Denys Poshyvanyk, Rudolf Ferenc, "Using the Conceptual Cohesion of Classes for Fault Prediction in Object-Oriented Systems," IEEE Transactions on Software Engineering, vol. 34, no. 2, pp. 287-300, Mar./Apr. 2008, doi:10.1109/TSE.2007.70768