Checking Inside the Black Box: Regression Testing by Comparing Value Spectra
October 2005 (vol. 31 no. 10)
pp. 869-883
Tao Xie, David Notkin
Comparing behaviors of program versions has become an important task in software maintenance and regression testing. Black-box program outputs have been used to characterize program behaviors and they are compared over program versions in traditional regression testing. Program spectra have recently been proposed to characterize a program's behavior inside the black box. Comparing program spectra of program versions offers insights into the internal behavioral differences between versions. In this paper, we present a new class of program spectra, value spectra, that enriches the existing program spectra family. We compare the value spectra of a program's old version and new version to detect internal behavioral deviations in the new version. We use a deviation-propagation call tree to present the deviation details. Based on the deviation-propagation call tree, we propose two heuristics to locate deviation roots, which are program locations that trigger the behavioral deviations. We also use path spectra (previously proposed program spectra) to approximate the program states in value spectra. We then similarly compare path spectra to detect behavioral deviations and locate deviation roots in the new version. We have conducted an experiment on eight C programs to evaluate our spectra-comparison approach. The results show that both value-spectra-comparison and path-spectra-comparison approaches can effectively expose program behavioral differences between program versions even when their program outputs are the same, and our value-spectra-comparison approach reports deviation roots with high accuracy for most programs.
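As a rough illustration of the idea in the abstract, the sketch below records, for each function call, the values at function entry and exit, and then compares these records across two versions of a toy program. It is not the authors' tool or their exact definition of value spectra: the names `make_recorder`, `build_old`, `build_new`, `run`, and `deviating_functions`, the toy functions, and the reduction of program state to argument and return values are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

# One recorded entry: (function name, argument values at entry, return value at exit).
# This is a simplified, hypothetical stand-in for a function-entry/exit program state.
Entry = Tuple[str, Tuple, object]


def make_recorder(trace: List[Entry]) -> Callable:
    """Return a decorator that appends each call's entry/exit values to `trace`."""
    def record(fn: Callable) -> Callable:
        def wrapper(*args):
            result = fn(*args)
            trace.append((fn.__name__, args, result))
            return result
        return wrapper
    return record


def build_old(trace: List[Entry]) -> Callable:
    """Old version of a toy program: scale by 2, then clamp to 10."""
    record = make_recorder(trace)

    @record
    def scale(x):
        return x * 2

    @record
    def clamp(x):
        return min(x, 10)

    @record
    def process(x):
        return clamp(scale(x))

    return process


def build_new(trace: List[Entry]) -> Callable:
    """New version: `scale` was changed (hypothetical modification)."""
    record = make_recorder(trace)

    @record
    def scale(x):
        return x * 3  # the changed line

    @record
    def clamp(x):
        return min(x, 10)

    @record
    def process(x):
        return clamp(scale(x))

    return process


def run(build: Callable, test_input: int):
    """Run one version on one test input; return its output and recorded entries."""
    trace: List[Entry] = []
    output = build(trace)(test_input)
    return output, trace


def deviating_functions(old_trace: List[Entry], new_trace: List[Entry]) -> List[str]:
    """Names of functions whose recorded entry/exit values differ between versions."""
    differing = set(old_trace) ^ set(new_trace)
    return sorted({name for name, _, _ in differing})


old_out, old_trace = run(build_old, 7)
new_out, new_trace = run(build_new, 7)
deviations = deviating_functions(old_trace, new_trace)
```

For the input 7, both versions produce the same black-box output because `clamp` masks the change, yet the recorded internal values of `scale` and `clamp` differ. This is the kind of difference that output comparison alone would miss. Locating which deviating function is the deviation root is left out of this sketch.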


Index Terms:
Program spectra, regression testing, software testing, empirical studies, software maintenance.
Tao Xie, David Notkin, "Checking Inside the Black Box: Regression Testing by Comparing Value Spectra," IEEE Transactions on Software Engineering, vol. 31, no. 10, pp. 869-883, Oct. 2005, doi:10.1109/TSE.2005.107