Automatic Mining of Source Code Repositories to Improve Bug Finding Techniques
June 2005 (vol. 31 no. 6)
pp. 466-480
We describe a method that uses the source code change history of a software project to drive and refine the search for bugs. Using data retrieved from the source code repository, we implement a static source code checker that searches for a commonly fixed bug and uses information automatically mined from the repository to refine its results. Applying our tool, we identified 178 warnings that are likely bugs in the Apache Web server source code and 546 warnings that are likely bugs in Wine, an open-source implementation of the Windows API. We show that our technique is more effective than the same static analysis without historical data from the source code repository.
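
The idea of refining a checker's output with repository history can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the bug pattern or the refinement rule, so the sketch assumes a checker that flags unchecked return values and a ranking that favors functions whose calls were fixed most often in the past. The names `CheckerWarning`, `rank_warnings`, and `fix_history`, and all the data, are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckerWarning:
    """One warning from a hypothetical static checker."""
    file: str
    line: int
    callee: str  # function whose return value is flagged as unchecked


def rank_warnings(warnings, fix_history):
    """Order warnings so those backed by more past fixes come first.

    fix_history holds one callee name per past commit that (by assumption)
    added a missing check on that callee's return value.
    """
    fixes_per_callee = Counter(fix_history)
    # sorted() is stable, so warnings with equal history keep checker order.
    # Counter returns 0 for callees that never appear in the history.
    return sorted(warnings, key=lambda w: -fixes_per_callee[w.callee])


# Toy data, invented for illustration only.
warnings = [
    CheckerWarning("a.c", 10, "strlen"),
    CheckerWarning("b.c", 42, "malloc"),
    CheckerWarning("c.c", 7, "fopen"),
]
history = ["malloc", "fopen", "malloc"]
ranked = rank_warnings(warnings, history)
```

With this toy history, the `malloc` warning is ranked first because it has two recorded fixes, and the `strlen` warning is ranked last because it has none. A checker that ignores history would report all three in discovery order.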

Index Terms:
Testing tools, version control, configuration control, debugging aids.
Citation:
Chadd C. Williams, Jeffrey K. Hollingsworth, "Automatic Mining of Source Code Repositories to Improve Bug Finding Techniques," IEEE Transactions on Software Engineering, vol. 31, no. 6, pp. 466-480, June 2005, doi:10.1109/TSE.2005.63