An Extensible System for Source Code Analysis
September 1998 (vol. 24 no. 9)
pp. 721-740

Abstract: Constructing code analyzers may be costly and error prone if inadequate technologies and tools are used. If they are written in a conventional programming language, for instance, several thousand lines of code may be required even for relatively simple analyses. One way of facilitating the development of code analyzers is to define a very high-level domain-oriented language and implement an application generator that creates the analyzers from the specification of the analyses they are intended to perform. This paper presents a system for developing code analyzers that uses a database to store both a no-loss fine-grained intermediate representation and the results of the analyses. The system uses an algebraic representation, called $F(p)$, as the user-visible intermediate representation. Analyzers are specified in a declarative language, called $F(p)-\ell$, which enables an analysis to be specified in the form of a traversal of an algebraic expression, with access to, and storage of, the database information that the algebraic expression indexes. A foreign language interface allows the analyzers to be embedded in C programs. This is useful for implementing the user interface of an analyzer, for example, or for facilitating interoperation of the generated analyzers with pre-existing tools. The paper evaluates the strengths and limitations of the proposed system and compares it with related approaches.
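
To make the idea of "an analysis as a traversal of an algebraic expression that indexes database information" more concrete, the following is a minimal Python sketch. It is not the paper's $F(p)$ notation or $F(p)-\ell$ syntax, neither of which is shown in the abstract. The operator names (`Seq`, `Loop`, `Leaf`), the dictionary standing in for the database, and the `count_decisions` analysis are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical stand-in for the fine-grained database:
# each node id maps to a record describing one program element.
database = {
    1: {"kind": "assign", "text": "i = 0"},
    2: {"kind": "cond", "text": "i < n"},
    3: {"kind": "assign", "text": "s += a[i]"},
    4: {"kind": "assign", "text": "i++"},
}

# Hypothetical algebraic operators. Leaves carry only an index
# into the database; the expression itself holds the structure.
@dataclass
class Leaf:
    node_id: int

@dataclass
class Seq:
    parts: tuple

@dataclass
class Loop:
    cond: Leaf
    body: "Expr"

Expr = Union[Leaf, Seq, Loop]

def count_decisions(expr, db):
    """Traverse the expression, read the indexed records, and store
    a per-node result back into the database (illustrative analysis)."""
    if isinstance(expr, Leaf):
        n = 1 if db[expr.node_id]["kind"] == "cond" else 0
        db[expr.node_id]["decisions"] = n  # result stored next to the data
        return n
    if isinstance(expr, Seq):
        return sum(count_decisions(part, db) for part in expr.parts)
    if isinstance(expr, Loop):
        return count_decisions(expr.cond, db) + count_decisions(expr.body, db)
    raise TypeError(f"unknown operator: {expr!r}")

# Expression for: i = 0; while (i < n) { s += a[i]; i++; }
program = Seq((Leaf(1), Loop(Leaf(2), Seq((Leaf(3), Leaf(4))))))
decisions = count_decisions(program, database)
```

In this sketch the analysis logic is a few lines per operator, and both its inputs and its results live in the same store, which is the division of labor the abstract describes. A real system would replace the dictionary with a database and the hand-written function with a generated analyzer.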

Index Terms:
Reverse engineering, code analysis, software maintenance, intermediate program representations, tool generation, integration.
Gerardo Canfora, Aniello Cimitile, Ugo De Carlini, Andrea De Lucia, "An Extensible System for Source Code Analysis," IEEE Transactions on Software Engineering, vol. 24, no. 9, pp. 721-740, Sept. 1998, doi:10.1109/32.713328