Statistical Debugging: A Hypothesis Testing-Based Approach
October 2006 (vol. 32 no. 10)
pp. 831-848
Chao Liu, IEEE
Long Fei, IEEE
Xifeng Yan, IEEE
Jiawei Han, IEEE
Manual debugging is tedious, as well as costly. The high cost has motivated the development of fault localization techniques, which help developers search for fault locations. In this paper, we propose a new statistical method, called Sober, which automatically localizes software faults without any prior knowledge of the program semantics. Unlike existing statistical approaches that select predicates correlated with program failures, Sober models the predicate evaluation in both correct and incorrect executions and regards a predicate as fault-relevant if its evaluation pattern in incorrect executions significantly diverges from that in correct ones. Featuring a rationale similar to that of hypothesis testing, Sober quantifies the fault relevance of each predicate in a principled way. We systematically evaluate Sober under the same setting as previous studies. The results clearly demonstrate its effectiveness: Sober could help developers locate 68 out of the 130 faults in the Siemens suite by examining no more than 10 percent of the code, whereas the Cause Transition approach proposed by Cleve and Zeller [6] and the statistical approach by Liblit et al. [12] locate 34 and 52 faults, respectively. Moreover, the effectiveness of Sober is also evaluated in an “imperfect world,” where the test suite is either inadequate or only partially labeled. The experiments indicate that Sober could achieve competitive quality under these harsh circumstances. Two case studies with grep 2.2 and bc 1.06 are reported, which shed light on the applicability of Sober to reasonably large programs.
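
The idea of ranking predicates by how far their evaluation pattern in incorrect executions diverges from that in correct ones can be illustrated with a minimal sketch. It is not the statistic defined in the paper: the per-run "evaluation bias" (fraction of true evaluations), the plain z-style score, and all function and predicate names below are assumptions made for illustration only.

```python
import math
from statistics import mean, pstdev


def evaluation_bias(n_true, n_false):
    """Fraction of a predicate's evaluations that were true in one run.

    Assumed summary of one execution; runs in which the predicate is
    never evaluated need separate handling and are rejected here.
    """
    total = n_true + n_false
    if total == 0:
        raise ValueError("predicate was never evaluated in this run")
    return n_true / total


def divergence_score(passing_biases, failing_biases):
    """Hypothesis-testing-style score (illustrative, not the paper's).

    Null hypothesis: failing runs follow the same bias distribution as
    passing runs. The score is the absolute standardized distance between
    the failing-run mean and the passing-run mean; larger means the
    failing runs look less like the passing ones.
    """
    mu = mean(passing_biases)
    sigma = pstdev(passing_biases)
    m = len(failing_biases)
    failing_mean = mean(failing_biases)
    if sigma == 0.0:
        # Passing runs never vary: any deviation is maximally suspicious.
        return math.inf if failing_mean != mu else 0.0
    return abs(failing_mean - mu) / (sigma / math.sqrt(m))


def rank_predicates(profiles):
    """Return predicate names sorted from most to least suspicious.

    `profiles` maps a predicate name to (passing_biases, failing_biases).
    """
    scores = {name: divergence_score(p, f) for name, (p, f) in profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical profiles for two predicates.
profiles = {
    "x > 0": ([0.5, 0.6, 0.4, 0.5], [0.5, 0.5]),
    "p == NULL": ([0.0, 0.1, 0.0, 0.1], [0.9, 1.0]),
}
ranking = rank_predicates(profiles)
print(ranking)
```

In this toy input, "p == NULL" is almost never true in passing runs but almost always true in failing runs, so it is ranked first; "x > 0" behaves the same way in both groups and receives a score of zero.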

[1] E. Clarke, O. Grumberg, and D. Peled, Model Checking. MIT Press, 1999.
[2] W. Visser, K. Havelund, G. Brat, and S. Park, “Model Checking Programs,” Proc. 15th IEEE Int'l Conf. Automated Software Eng. (ASE '00), pp. 3-12, 2000.
[3] M. Musuvathi, D. Park, A. Chou, D. Engler, and D. Dill, “CMC: A Pragmatic Approach to Model Checking Real Code,” Proc. Fifth Symp. Operating System Design and Implementation (OSDI '02), pp. 75-88, 2002.
[4] M. Renieris and S. Reiss, “Fault Localization with Nearest Neighbor Queries,” Proc. 18th IEEE Int'l Conf. Automated Software Eng. (ASE '03), pp. 30-39, 2003.
[5] A. Zeller, “Isolating Cause-Effect Chains from Computer Programs,” Proc. ACM Int'l Symp. Foundations of Software Eng. (FSE '02), pp. 1-10, 2002.
[6] H. Cleve and A. Zeller, “Locating Causes of Program Failures,” Proc. 27th Int'l Conf. Software Eng. (ICSE '05), pp. 342-351, 2005.
[7] B. Liblit, A. Aiken, A. Zheng, and M. Jordan, “Bug Isolation via Remote Program Sampling,” Proc. ACM SIGPLAN 2003 Int'l Conf. Programming Language Design and Implementation (PLDI '03), pp. 141-154, 2003.
[8] J. Jones and M. Harrold, “Empirical Evaluation of the Tarantula Automatic Fault-Localization Technique,” Proc. 20th IEEE/ACM Int'l Conf. Automated Software Eng. (ASE '05), pp. 273-282, 2005.
[9] N. Gupta, H. He, X. Zhang, and R. Gupta, “Locating Faulty Code Using Failure-Inducing Chops,” Proc. 20th IEEE/ACM Int'l Conf. Automated Software Eng. (ASE '05), pp. 263-272, 2005.
[10] I. Vessey, “Expertise in Debugging Computer Programs: A Process Analysis,” Int'l J. Man-Machine Studies, vol. 23, no. 5, pp. 459-494, 1985.
[11] M. Harrold, G. Rothermel, K. Sayre, R. Wu, and L. Yi, “An Empirical Investigation of the Relationship between Spectra Differences and Regression Faults,” Software Testing, Verification, and Reliability, vol. 10, no. 3, pp. 171-194, 2000.
[12] B. Liblit, M. Naik, A. Zheng, A. Aiken, and M. Jordan, “Scalable Statistical Bug Isolation,” Proc. ACM SIGPLAN 2005 Int'l Conf. Programming Language Design and Implementation (PLDI '05), pp. 15-26, 2005.
[13] Y. Brun and M. Ernst, “Finding Latent Code Errors via Machine Learning over Program Executions,” Proc. 26th Int'l Conf. Software Eng. (ICSE '04), pp. 480-490, 2004.
[14] S. Hangal and M. Lam, “Tracking Down Software Bugs Using Automatic Anomaly Detection,” Proc. 24th Int'l Conf. Software Eng. (ICSE '02), pp. 291-301, 2002.
[15] G. Casella and R. Berger, Statistical Inference, second ed. Duxbury, 2001.
[16] M. Ernst, J. Cockrell, W. Griswold, and D. Notkin, “Dynamically Discovering Likely Program Invariants to Support Program Evolution,” IEEE Trans. Software Eng., vol. 27, no. 2, pp. 1-25, Feb. 2001.
[17] M. Hutchins, H. Foster, T. Goradia, and T. Ostrand, “Experiments of the Effectiveness of Dataflow- and Controlflow-Based Test Adequacy Criteria,” Proc. 16th Int'l Conf. Software Eng. (ICSE '94), pp. 191-200, 1994.
[18] G. Rothermel and M. Harrold, “Empirical Studies of a Safe Regression Test Selection Technique,” IEEE Trans. Software Eng., vol. 24, no. 6, pp. 401-419, June 1998.
[19] T. Cover and J. Thomas, Elements of Information Theory, first ed. Wiley-Interscience, 1991.
[20] B. Pytlik, M. Renieris, S. Krishnamurthi, and S. Reiss, “Automated Fault Localization Using Potential Invariants,” Proc. Fifth Int'l Workshop Automated and Algorithmic Debugging (AADEBUG '03), pp. 273-276, 2003.
[21] C. Liu, X. Yan, L. Fei, J. Han, and S. Midkiff, “Sober: Statistical Model-Based Bug Localization,” Proc. 10th European Software Eng. Conf./13th ACM SIGSOFT Int'l Symp. Foundations of Software Eng. (ESEC/FSE '05), pp. 286-295, 2005.
[22] T. Zimmermann and A. Zeller, “Visualizing Memory Graphs,” Revised Lectures on Software Visualization, Int'l Seminar, pp. 191-204, 2002.
[23] J. Jones, M. Harrold, and J. Stasko, “Visualization of Test Information to Assist Fault Localization,” Proc. 24th Int'l Conf. Software Eng. (ICSE '02), pp. 467-477, 2002.
[24] A. Zeller and R. Hildebrandt, “Simplifying and Isolating Failure-Inducing Input,” IEEE Trans. Software Eng., vol. 28, no. 2, pp. 183-200, Feb. 2002.
[25] J. Misurda, J. Clause, J. Reed, B. Childers, and M. Soffa, “Jazz: A Tool for Demand-Driven Structural Testing,” Proc. 14th Int'l Conf. Compiler Construction (CC '05), pp. 242-245, 2005.
[26] C. Pacheco and M. Ernst, “Eclat: Automatic Generation and Classification of Test Inputs,” Proc. 19th European Conf. Object-Oriented Programming (ECOOP '05), pp. 504-527, 2005.
[27] C. Boyapati, S. Khurshid, and D. Marinov, “Korat: Automated Testing Based on Java Predicates,” Proc. ACM/SIGSOFT Int'l Symp. Software Testing and Analysis (ISSTA '02), pp. 123-133, 2002.
[28] C. Csallner and Y. Smaragdakis, “JCrasher: An Automatic Robustness Tester for Java,” Software—Practice and Experience, vol. 34, no. 11, pp. 1025-1050, 2004.
[29] H. Do, S. Elbaum, and G. Rothermel, “Supporting Controlled Experimentation with Testing Techniques: An Infrastructure and its Potential Impact,” Empirical Software Eng.: An Int'l J., vol. 10, no. 4, pp. 405-435, 2005.
[30] D. Wheeler, SLOCCount: A Set of Tools for Counting Physical Source Lines of Code, http://www.dwheeler.com/sloccount/, 2006.
[31] K. Apt and E. Olderog, Verification of Sequential and Concurrent Programs, second ed. Springer-Verlag, 1997.
[32] D. Engler, D. Chen, and A. Chou, “Bugs as Inconsistent Behavior: A General Approach to Inferring Errors in Systems Code,” Proc. Symp. Operating Systems Principles, pp. 57-72, 2001.
[33] H. Agrawal, J. Horgan, S. London, and W. Wong, “Fault Localization Using Execution Slices and Dataflow Tests,” Proc. Sixth Int'l Symp. Software Reliability Eng., pp. 143-151, 1995.
[34] F. Tip, “A Survey of Program Slicing Techniques,” J. Programming Languages, vol. 3, pp. 121-189, 1995.
[35] J. Lyle and M. Weiser, “Automatic Program Bug Location by Program Slicing,” Proc. Second Int'l Conf. Computers and Applications, pp. 877-882, 1987.
[36] Y. Ayalew and R. Mittermeir, “Spreadsheet Debugging,” Proc. European Spreadsheet Risks Interest Group Ann. Conf., 2003.
[37] J. Ruthruff, M. Burnett, and G. Rothermel, “An Empirical Study of Fault Localization for End-User Programmers,” Proc. 27th Int'l Conf. Software Eng. (ICSE '05), pp. 352-361, 2005.
[38] A. Ko and B. Myers, “Designing the Whyline: A Debugging Interface for Asking Questions about Program Behavior,” Proc. SIGCHI Conf. Human Factors in Computing Systems (CHI '04), pp. 151-158, 2004.
[39] W. Dickinson, D. Leon, and A. Podgurski, “Finding Failures by Cluster Analysis of Execution Profiles,” Proc. 23rd Int'l Conf. Software Eng. (ICSE '01), pp. 339-348, 2001.
[40] A. Podgurski, D. Leon, P. Francis, W. Masri, M. Minch, J. Sun, and B. Wang, “Automated Support for Classifying Software Failure Reports,” Proc. 25th Int'l Conf. Software Eng. (ICSE '03), pp. 465-475, 2003.

Index Terms:
Debugging aids, statistical methods, statistical debugging.
Chao Liu, Long Fei, Xifeng Yan, Jiawei Han, Samuel P. Midkiff, "Statistical Debugging: A Hypothesis Testing-Based Approach," IEEE Transactions on Software Engineering, vol. 32, no. 10, pp. 831-848, Oct. 2006, doi:10.1109/TSE.2006.105