Issue No.07 - July (2013 vol.39)
pp: 1002-1017
Barbara Kitchenham, Keele University, Keele
Dag I.K. Sjøberg, University of Oslo, Oslo
Tore Dybå, University of Oslo, Oslo and SINTEF, Trondheim
O. Pearl Brereton, Keele University, Keele
David Budgen, Durham University, Durham
Martin Höst, Lund University, Lund
Per Runeson, Lund University, Lund
ABSTRACT
Context: Several textbooks and papers published between 2000 and 2002 have attempted to introduce experimental design and statistical methods to software engineers undertaking empirical studies.
Objective: This paper investigates whether there has been an increase in the quality of human-centric experimental and quasi-experimental journal papers over the time period 1993 to 2010.
Method: Seventy experimental and quasi-experimental papers published in four general software engineering journals in the years 1992-2002 and 2006-2010 were each assessed for quality by three empirical software engineering researchers using two quality assessment methods (a questionnaire-based method and a subjective overall assessment). Regression analysis was used to assess the relationship between paper quality and the year of publication, publication date group (before 2003 and after 2005), source journal, average coauthor experience, citation of statistical textbooks and papers, and paper length. The results were validated both by removing papers for which the quality score appeared unreliable and by using an alternative quality measure.
Results: Paper quality was significantly associated with year, citing general statistical texts, and paper length (p < 0.05). Paper length did not reach significance when quality was measured using an overall subjective assessment.
Conclusions: The quality of experimental and quasi-experimental software engineering papers appears to have improved gradually since 1993.
INDEX TERMS
Software engineering, Guidelines, Correlation, Manuals, Educational institutions, Humans, Materials, Quality evaluation, empirical studies, human-centric experiments, experimentation
CITATION
Barbara Kitchenham, Dag I.K. Sjøberg, Tore Dybå, O. Pearl Brereton, David Budgen, Martin Höst, Per Runeson, "Trends in the Quality of Human-Centric Software Engineering Experiments--A Quasi-Experiment", IEEE Transactions on Software Engineering, vol. 39, no. 7, pp. 1002-1017, July 2013, doi:10.1109/TSE.2012.76