Identifying Modules via Concept Analysis
November/December 1999 (vol. 25 no. 6)
pp. 749-768

Abstract—We describe a general technique for identifying modules in legacy code. The method is based on concept analysis—a branch of lattice theory that can be used to identify similarities among a set of objects based on their attributes. We discuss how concept analysis can identify potential modules using both “positive” and “negative” information. We present an algorithmic framework to construct a lattice of concepts from a program, where each concept represents a potential module. We define the notion of a concept partition, present an algorithm for discovering all concept partitions of a given concept lattice, and prove the algorithm correct.
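The ideas named in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it enumerates concepts by brute force (closing every subset of objects, which is exponential) and then searches for sets of concepts whose extents are pairwise disjoint and together cover every object, which is how "concept partition" is assumed to be defined here. The function names `compute_concepts` and `concept_partitions`, the toy context, and its attribute names are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def compute_concepts(context):
    """Return every (extent, intent) pair of a binary context.

    `context` maps each object (e.g. a function name) to the set of
    attributes it has. Brute force: only suitable for tiny examples.
    """
    objects = frozenset(context)
    attributes = frozenset().union(*context.values())

    def shared_attributes(objs):
        # Attributes common to all given objects (all attributes if none given).
        shared = set(attributes)
        for obj in objs:
            shared &= context[obj]
        return frozenset(shared)

    def objects_having(attrs):
        # Objects that possess every attribute in attrs.
        return frozenset(o for o in objects if attrs <= context[o])

    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(sorted(objects), size):
            intent = shared_attributes(subset)
            concepts.add((objects_having(intent), intent))
    return concepts


def concept_partitions(concepts, objects):
    """Return all sets of concepts whose extents partition `objects`.

    Assumed definition: extents are non-empty, pairwise disjoint,
    and their union is the full object set.
    """
    candidates = sorted(
        (c for c in concepts if c[0]),
        key=lambda c: (sorted(c[0]), sorted(c[1])),
    )
    results = []

    def extend(start, chosen, covered):
        if covered == objects:
            results.append(list(chosen))
            return
        for i in range(start, len(candidates)):
            extent = candidates[i][0]
            if not (extent & covered):
                extend(i + 1, chosen + [candidates[i]], covered | extent)

    extend(0, [], frozenset())
    return results


# Hypothetical toy context: four functions and the data structure each uses.
example_context = {
    "push": {"uses_stack"},
    "pop": {"uses_stack"},
    "enq": {"uses_queue"},
    "deq": {"uses_queue"},
}

if __name__ == "__main__":
    found = compute_concepts(example_context)
    for partition in concept_partitions(found, frozenset(example_context)):
        print([sorted(extent) for extent, _ in partition])
```

For this toy context the sketch finds four concepts and two partitions: the trivial one containing all four functions, and one that separates the stack functions from the queue functions, each group being a candidate module.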

Index Terms:
Concept analysis, modularization, software migration, software restructuring, reverse engineering, design recovery.
Michael Siff, Thomas Reps, "Identifying Modules via Concept Analysis," IEEE Transactions on Software Engineering, vol. 25, no. 6, pp. 749-768, Nov.-Dec. 1999, doi:10.1109/32.824377