A Survey of Controlled Experiments in Software Engineering
September 2005 (vol. 31 no. 9)
pp. 733-753
The classical method for identifying cause-effect relationships is to conduct controlled experiments. This paper reports on the present state of how controlled experiments in software engineering are conducted and the extent to which relevant information is reported. Among the 5,453 scientific articles published in 12 leading software engineering journals and conferences in the decade from 1993 to 2002, 103 articles (1.9 percent) reported controlled experiments in which individuals or teams performed one or more software engineering tasks. This survey quantitatively characterizes the topics of the experiments and their subjects (number of subjects, students versus professionals, recruitment, and rewards for participation), tasks (type of task, duration, and type and size of application), and environments (location, development tools). Furthermore, the survey reports on how internal and external validity is addressed and the extent to which experiments are replicated. The gathered data reflects the relevance of software engineering experiments to industrial practice and the scientific maturity of software engineering research.


Index Terms:
Controlled experiments, survey, research methodology, empirical software engineering.
Citation:
Dag I.K. Sjøberg, Jo E. Hannay, Ove Hansen, Vigdis By Kampenes, Amela Karahasanović, Nils-Kristian Liborg, Anette C. Rekdal, "A Survey of Controlled Experiments in Software Engineering," IEEE Transactions on Software Engineering, vol. 31, no. 9, pp. 733-753, Sept. 2005, doi:10.1109/TSE.2005.97