Using Origin Analysis to Detect Merging and Splitting of Source Code Entities
February 2005 (vol. 31 no. 2)
pp. 166-181
Merging and splitting source code entities is a common activity during the lifespan of a software system; as developers rethink the essential structure of a system or plan for a new evolutionary direction, so must they be able to reorganize the design artifacts at various abstraction levels as seems appropriate. However, while the raw effects of such changes may be plainly evident in the new artifacts, the original context of the design changes is often lost. That is, it may be obvious which characters of which files have changed, but it may not be obvious where or why moving, renaming, merging, and/or splitting of design elements has occurred. In this paper, we discuss how we have extended origin analysis [1], [2] to aid in the detection of merging and splitting of files and functions in procedural code; in particular, we show how reasoning about how call relationships have changed can aid a developer in locating where merges and splits have occurred, thereby helping to recover some information about the context of the design change. We also describe a case study of these techniques (as implemented in the Beagle tool) using the PostgreSQL database system as the subject.
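As a rough illustration of the idea of reasoning about changed call relationships, the following minimal sketch flags possible function merges between two versions. It is not the algorithm used by Beagle: the data layout (a mapping from each function name to the set of its callers), the function name merge_candidates, and the sample function names are all assumptions made for this example. The heuristic is that if two or more functions vanish in the new version, and all of their former callers now call the same target, that target may be the result of a merge.

```python
def merge_candidates(old_callers, new_callers):
    """Suggest possible merges between two versions of a system.

    Both arguments map a function name to the set of functions that
    call it (a hypothetical representation chosen for this sketch).
    Returns {target: sorted list of vanished functions} for every
    target that appears to have absorbed two or more vanished functions.
    """
    # Functions present in the old version but missing from the new one.
    vanished = set(old_callers) - set(new_callers)
    candidates = {}
    for target, callers_now in new_callers.items():
        # Only callers the target gained between versions count as evidence.
        gained = callers_now - old_callers.get(target, set())
        absorbed = sorted(
            f for f in vanished
            if old_callers[f] and old_callers[f] <= gained
        )
        if len(absorbed) >= 2:
            candidates[target] = absorbed
    return candidates


# Hypothetical example: parse_int and parse_float disappear, and their
# callers now call a new function parse_number instead.
old_version = {
    "parse_int": {"read_row"},
    "parse_float": {"read_col"},
    "log": {"read_row", "read_col"},
}
new_version = {
    "parse_number": {"read_row", "read_col"},
    "log": {"read_row", "read_col"},
}
print(merge_candidates(old_version, new_version))
```

A split could be approached symmetrically by swapping the roles of the two versions. In practice, a heuristic like this only produces candidates for a developer to inspect; it would need to be combined with other evidence, such as name or metric similarity, before drawing conclusions.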

[1] Q. Tu and M.W. Godfrey, “An Integrated Approach for Studying Software Architectural Evolution,” Proc. Int'l Workshop Program Comprehension (IWPC-02), pp. 127-136, June 2002.
[2] M.W. Godfrey and Q. Tu, “Tracking Structural Evolution Using Origin Analysis,” Proc. Int'l Workshop Principles of Software Evolution (IWPSE-02), pp. 117-119, May 2002.
[3] M. Fowler, Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999.
[4] W. Opdyke, “Refactoring Object-Oriented Frameworks,” PhD dissertation, Univ. of Illinois at Urbana-Champaign, 1992.
[5] T. Mens, “A Survey of Software Refactoring,” IEEE Trans. Software Eng., vol. 30, no. 2, pp. 126-139, Feb. 2004.
[6] K. Kontogiannis, “Evaluation Experiments on the Detection of Programming Patterns Using Software Metrics,” Proc. Working Conf. Reverse Eng. (WCRE-97), pp. 577-586, Oct. 1997.
[7] L. Zou, “Toward an Improved Understanding of Software Change,” master's thesis, Univ. of Waterloo, 2003.
[8] SWAGkit homepage, http://www.swag.uwaterloo.ca/swagkit/, 2004.
[9] Scientific Toolworks, “Understand for C++,” http://www.scitools.com/ucpp.html, 2004.
[10] K.W. Church and J.I. Helfman, “Dotplot: A Program for Exploring Self-Similarity in Millions of Lines of Text and Code,” J. Computational and Graphical Statistics, vol. 2, no. 2, pp. 153-174, June 1993.
[11] B.S. Baker, “On Finding Duplication and Near-Duplication in Large Software Systems,” Proc. Working Conf. Reverse Eng. (WCRE-95), pp. 86-95, July 1995.
[12] J.H. Johnson, “Substring Matching for Clone Detection and Change Tracking,” Proc. Int'l Conf. Software Maintenance (ICSM-94), pp. 120-126, Sept. 1994.
[13] S. Ducasse, M. Rieger, and S. Demeyer, “A Language Independent Approach for Detecting Duplicated Code,” Proc. Int'l Conf. Software Maintenance (ICSM-99), pp. 109-118, Sept. 1999.
[14] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1994.
[15] E. van Emden and L. Moonen, “Java Quality Assurance by Detecting Code Smells,” Proc. Working Conf. Reverse Eng. (WCRE-02), pp. 227-237, Oct. 2002.
[16] T. Tourwe and T. Mens, “Automatically Identifying Refactoring Opportunities Using Logic Meta Programming,” Proc. European Conf. Software Maintenance and Reeng. (CSMR-03), pp. 91-100, Mar. 2003.
[17] S. Demeyer, S. Ducasse, and O. Nierstrasz, “Finding Refactorings via Change Metrics,” Proc. Int'l Conf. Object-Oriented Programming Systems, Languages & Applications (OOPSLA-00), pp. 166-177, Oct. 2000.
[18] G. Antoniol, U. Villano, E. Merlo, and M.D. Penta, “Analyzing Cloning Evolution in the Linux Kernel,” Information and Software Technology, vol. 44, no. 13, pp. 755-765, Oct. 2002.
[19] J. Krinke, “Identifying Similar Code with Program Dependence Graphs,” Proc. Working Conf. Reverse Eng. (WCRE-01), pp. 301-309, Oct. 2001.
[20] R. Conradi and B. Westfechtel, “Version Models for Software Configuration Management,” ACM Computing Surveys, vol. 30, no. 2, pp. 232-282, June 1998.
[21] T. Mens, “A State-of-the-Art Survey on Software Merging,” IEEE Trans. Software Eng., vol. 28, no. 5, pp. 449-462, May 2002.
[22] J.J. Hunt and W.F. Tichy, “Extensible Language-Aware Merging,” Proc. 2002 Int'l Conf. Software Maintenance (ICSM-02), pp. 511-520, Oct. 2002.
[23] G. Malpohl, J.J. Hunt, and W.F. Tichy, “Renaming Detection,” Proc. Int'l Conf. Automated Software Eng. (ASE-00), pp. 183-202, Sept. 2000.
[24] N. Gold and A. Mohan, “A Framework for Understanding Conceptual Changes in Evolving Source Code,” Proc. Int'l Conf. Software Maintenance (ICSM-03), pp. 431-439, Sept. 2003.
[25] L. Zou and M.W. Godfrey, “Detecting Merging and Splitting Using Origin Analysis,” Proc. Working Conf. Reverse Eng. (WCRE-03), pp. 146-154, Nov. 2003.

Index Terms:
Software evolution, origin analysis, restructuring, reverse engineering, and reengineering.
Michael W. Godfrey, Lijie Zou, "Using Origin Analysis to Detect Merging and Splitting of Source Code Entities," IEEE Transactions on Software Engineering, vol. 31, no. 2, pp. 166-181, Feb. 2005, doi:10.1109/TSE.2005.28