Test Case Prioritization: A Family of Empirical Studies
February 2002 (vol. 28 no. 2)
pp. 159-182

To reduce the cost of regression testing, software testers may prioritize their test cases so that those which are more important, by some measure, are run earlier in the regression testing process. One potential goal of such prioritization is to increase a test suite's rate of fault detection. Previous work reported results of studies that showed that prioritization techniques can significantly improve the rate of fault detection. Those studies, however, raised several additional questions: 1) can prioritization techniques be effective when targeted at specific modified versions; 2) what trade-offs exist between fine-granularity and coarse-granularity prioritization techniques; and 3) can the incorporation of measures of fault proneness into prioritization techniques improve their effectiveness? To address these questions, we have performed several new studies in which we empirically compared prioritization techniques using both controlled experiments and case studies. The results of these studies show that each of the prioritization techniques considered can improve the rate of fault detection of test suites overall. Fine-granularity techniques typically outperformed coarse-granularity techniques, but only by a relatively small margin overall; in other words, the relative imprecision in coarse-granularity analysis did not dramatically reduce coarse-granularity techniques' ability to improve the rate of fault detection. Incorporation of fault-proneness techniques produced relatively small improvements over other techniques in terms of rate of fault detection, a result which ran contrary to our expectations. Our studies also show that the relative effectiveness of various techniques can vary significantly across target programs. Furthermore, our analysis shows that whether the effectiveness differences observed will result in savings in practice varies substantially with the cost factors associated with particular testing processes. Further work to understand the sources of this variance and to incorporate such understanding into prioritization techniques and the choice of techniques would be beneficial.
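
To make "rate of fault detection" concrete, the following is a minimal sketch, not the authors' implementation. It assumes the APFD measure (average percentage of faults detected) that is commonly used in the test case prioritization literature; the abstract itself does not name a metric. The function name `apfd`, the test names, and the fault data are hypothetical and chosen only for illustration.

```python
def apfd(order, detects, num_faults):
    """Return the APFD value for one ordering of a test suite.

    order      -- list of test ids in the order they are run
    detects    -- dict mapping test id -> set of fault ids it reveals
    num_faults -- total number of faults; every fault must be revealed
                  by at least one test in `order`
    """
    n = len(order)
    first_pos = {}  # fault id -> 1-based position of first revealing test
    for pos, test in enumerate(order, start=1):
        for fault in detects.get(test, ()):
            first_pos.setdefault(fault, pos)
    if len(first_pos) != num_faults:
        raise ValueError("some faults are not detected by any test")
    # APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n)
    return 1 - sum(first_pos.values()) / (n * num_faults) + 1 / (2 * n)


# Hypothetical data: three tests, two faults.
detects = {"A": set(), "B": {1}, "C": {1, 2}}

unordered = apfd(["A", "B", "C"], detects, num_faults=2)
prioritized = apfd(["C", "B", "A"], detects, num_faults=2)
print(f"{unordered:.4f} {prioritized:.4f}")  # prints "0.3333 0.8333"
```

Under this assumed metric, running the test that reveals both faults first raises the score from about 0.33 to about 0.83, which is the kind of improvement a prioritization technique aims for.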

Index Terms:
Test case prioritization, regression testing, empirical studies.
Citation:
S. Elbaum, A.G. Malishevsky, G. Rothermel, "Test Case Prioritization: A Family of Empirical Studies," IEEE Transactions on Software Engineering, vol. 28, no. 2, pp. 159-182, Feb. 2002, doi:10.1109/32.988497