Building Knowledge through Families of Experiments
July/August 1999 (vol. 25 no. 4)
pp. 456-473

Abstract—Experimentation in software engineering is necessary but difficult. One reason is that there are a large number of context variables and, so, creating a cohesive understanding of experimental results requires a mechanism for motivating studies and integrating results. It requires a community of researchers that can replicate studies, vary context variables, and build models that represent the common observations about the discipline. This paper discusses the experience of the authors, based upon a collection of experiments, in terms of a framework for organizing sets of related studies. With such a framework, experiments can be viewed as part of common families of studies, rather than being isolated events. Common families of studies can contribute to important and relevant hypotheses that may not be suggested by individual experiments. A framework also facilitates building knowledge in an incremental manner through the replication of experiments within families of studies. To support the framework, this paper discusses the experiences of the authors in carrying out empirical studies, with specific emphasis on persistent problems encountered in experimental design, threats to validity, criteria for evaluation, and execution of experiments in the domain of software engineering.

[1] F.T. Baker, “Chief Programmer Team Management of Production Programming,” IBM Systems J., vol. 11, no. 1, 1972.
[2] V.R. Basili, "The Experimental Paradigm in Software Engineering," Proc. Int'l Workshop Experimental Software Eng. Issues, Lecture Notes in Computer Science, Vol. 706 1992, Springer-Verlag, Berlin, pp. 3-12.
[3] V.R. Basili, "Evolving and Packaging Reading Technologies," J. Systems and Software, vol. 38, no. 1, July 1997, pp. 3-12.
[4] V. Basili, G. Caldiera, F. Lanubile, and F. Shull, “Studies on Reading Techniques,” Proc. 21st Ann. Software Eng. Workshop, pp. 59–65, SEL-96-002, Goddard Space Flight Center, Greenbelt, Md., Dec. 1996.
[5] V.R. Basili, S. Green, O. Laitenberger, F. Lanubile, F. Shull, S. Soerumgaard, and M. Zelkowitz, “The Empirical Investigation of Perspective-Based Reading,” Empirical Software Eng. J., vol. 1, no. 2, 1996.
[6] V.R. Basili and D.H. Hutchens, “An Empirical Study of a Syntactic Metric Family,” IEEE Trans. Software Eng., vol. 9, no. 6, pp. 664–672, Nov. 1983.
[7] V.R. Basili and R.W. Reiter, “A Controlled Experiment Quantitatively Comparing Software Development Approaches,” IEEE Trans. Software Eng., vol. 7, no. 3, pp. 299–320, May 1981.
[8] V.R. Basili and H.D. Rombach, "The TAME Project: Towards Improvement-Oriented Software Environments," IEEE Trans. Software Eng., Vol. 14, No. 6, 1988, pp. 758-773.
[9] V.R. Basili and R.W. Selby, “Comparing the Effectiveness of Software Testing Strategies,” IEEE Trans. Software Eng., vol. 13, pp. 1,278-1,296, 1987.
[10] V.R. Basili, R.W. Selby, and D.H. Hutchens, "Experimentation in Software Engineering," IEEE Trans. Software Eng., vol. 12, pp. 733-743, 1986.
[11] V.R. Basili, F. Lanubile, and F. Shull, “Investigating Maintenance Processes in a Framework-Based Environment,” Proc. Int'l Conf. Software Maintenance, pp. 256–264, Bethesda, Md., 1998.
[12] L.C. Briand, K. El Emam, and S. Morasca, “On the Application of Measurement Theory in Software Engineering,” Empirical Software Eng. J., vol. 1, no. 1, pp. 61–88, 1996.
[13] R. Brooks, “Studying Programmer Behavior Experimentally: The Problems of Proper Methodology,” Comm. ACM, vol. 23, no. 4, pp. 207–213, Apr. 1980.
[14] A. Brooks, J. Daly, J. Miller, M. Roper, and M. Wood, “Replication of Experimental Results in Software Engineering,” Technical Report, EFoCS-17-95 [RR/95/193], Dept. of Computer Science, Univ. of Strathclyde, 1995.
[15] D.T. Campbell and J.C. Stanley, Experimental and Quasi-Experimental Designs for Research. Boston: Houghton Mifflin Co., 1963.
[16] M. Ciolkowski, C. Differding, O. Laitenberger, and J. Munch, “Empirical Investigation of Perspective-Based Reading: A Replicated Experiment,” Technical Report ISERN-97-13, Int'l Software Eng. Research Network, 1997.
[17] Composable Systems Group, “Model Problems,” 1995.
[18] T.D. Cook and D.T. Campbell, Quasi-Experimentation: Design and Analysis Issues for Field Settings. Boston: Houghton Mifflin Co., 1979.
[19] N.E. Fenton, S. Lawrence Pfleeger, and R. Glass, “Science and Substance: A Challenge to Software Engineers,” IEEE Software, vol. 11, no. 4, pp. 86–95, July 1994.
[20] N. Fenton and L. Pfleeger, Software Metrics–A Rigorous and Practical Approach, second ed. Boston, PWS-Publishing, 1997.
[21] P. Fusaro and F. Lanubile, “A Replicated Experiment to Assess Requirements Inspection Techniques,” Empirical Software Eng., vol. 2, no. 1, pp. 39–57, 1997.
[22] J. Gilgun, “Definitions, Methodologies, and Methods in Qualitative Family Research,” Qualitative Methods in Family Research, J. Gilgun, K. Daly, and G. Handel, eds., Sage Publications, 1992.
[23] G.J. Hidding, “Reinventing Methodology: Who Reads it and Why?” Comm. ACM, vol. 40, no. 11, pp. 102-109, Nov. 1997.
[24] IEEE Software Engineering Standards, IEEE CS Press, 1987.
[25] C.M. Judd, E.R. Smith, and L.H. Kidder, Research Methods in Social Relations, sixth ed., Orlando: Harcourt Brace Jovanovich, 1991.
[26] F. Lanubile, “Empirical Evaluation of Software Maintenance Technologies,” Empirical Software Eng. J., vol. 2, no. 2, pp. 95–106, 1997.
[27] F. Lanubile, “Report on The Results of The Parallel Project Meeting Reading Techniques,” Oct. 1997. index.htm
[28] F. Lanubile, F. Shull, and V.R. Basili, “Experimenting with Error Abstraction in Requirements Documents,” Proc. Fifth Int'l Symp. Software Metrics, pp. 114–121, Bethesda, Md., 1998.
[29] S.A. Lee, “A Scientific Methodology for MIS Case Studies,” MIS Quarterly, vol. 13, no. 1, pp. 33–50, 1989.
[30] C.M. Lott and H.D. Rombach, “Repeatable Software Engineering Experiments for Comparing Defect-Detection Techniques,” Empirical Software Eng. J., vol. 1, no. 3, pp. 241–277, 1996.
[31] J. Miller, J. Daly, M. Wood, M. Roper, and A. Brooks, “Statistical Power and Its Subcomponents—Missing and Misunderstood Concepts in Software Engineering Empirical Research,” J. Information and Software Technology, vol. 39, pp. 285–295, 1997.
[32] J. Miller, M. Wood, and M. Roper, “Further Experiences with Scenarios and Checklists,” Empirical Software Eng., vol. 3, no. 1, pp. 37–64, 1998.
[33] D.C. Montgomery, Design and Analysis of Experiments, fourth ed., John Wiley and Sons, 1997.
[34] K. Popper, The Logic of Scientific Discovery. New York: Harper Torchbooks, 1968.
[35] A. Porter and L. Votta, “Comparing Detection Methods for Software Requirements Inspections: A Replication Using Professional Subjects,” Empirical Software Eng., vol. 3, pp. 355–379, 1998.
[36] A.A. Porter, L.G. Votta, and V.R. Basili, “Comparing Detection Methods for Software Requirements Inspections: A Replicated Experiment,” IEEE Trans. Software Eng., vol. 21, no. 6, pp. 563-575, June 1995.
[37] K. Sandahl, O. Blomkvist, J. Karlsson, C. Krysander, M. Lindvall, and N. Ohlsson, “An Extended Replication of an Experiment for Assessing Methods for Software Requirements Inspections,” Empirical Software Eng., vol. 3, pp. 327–354, 1998.
[38] C.B. Seaman and V.R. Basili, “Communication and Organization: An Empirical Study of Discussion in Inspection Meetings,” IEEE Trans. Software Eng., vol. 24, no. 6, June 1998.
[39] B.A. Sheil, "The Psychological Study of Programming," ACM Computing Surveys, 1981.
[40] F.J. Shull, “Developing Techniques for Using Software Documents: A Series of Empirical Studies,” PhD thesis, Univ. of Maryland, College Park, 1998.
[41] F. Shull, “Reading Techniques for Object-Oriented Frameworks,” projects/SoftEng/ESEG/manual/sbr_package manual.html
[42] F. Shull, “Lab Package for the Empirical Investigation of Perspective-Based Reading,” manual/pbr_packagemanual.html.
[43] F. Shull, F. Lanubile, and V.R. Basili, “Investigating Reading Techniques for Framework Learning,” Technical Report CS-TR-3896, UMCP Dept. of Computer Science, UMIACS-TR-98-26, UMCP Inst. for Advanced Computer Studies, ISERN-98-16 Int'l Software Eng. Research Network, May 1998.
[44] S. Sørumgård, “An Empirical Study of Process Conformance,” Proc. 21st Ann. Software Eng. Workshop, pp. 115–124, SEL-96-002, Goddard Space Flight Center, Greenbelt, Md., Dec. 1996.
[45] W. Tichy et al., "Experimental Evaluation in Computer Science: A Quantitative Study," J. of Systems and Software, Vol. 28, No. 1, 1995, pp. 9-18.
[46] C. Wohlin and P. Runeson, eds., Introduction to Experimentation in Software Engineering, Technical Report, LUTEDX (TETS-7167), Dept. of Comm. Systems, Lund Inst. of Technology, Lund Univ., 1997.
[47] M. Zelkowitz and D. Wallace, “Experimental Models for Validating Technology,” Computer, vol. 31, no. 5, pp. 23–31, May 1998.
[48] Z. Zhang, V.R. Basili, and B. Shneiderman, “An Empirical Study of Perspective-Based Usability Inspection,” Human Factors and Ergonomics Soc. Ann. Meeting, Chicago, Oct. 1998.

Index Terms:
Empirical software engineering, experimental design, software process, software measurement, software reading techniques.
Victor R. Basili, Forrest Shull, Filippo Lanubile, "Building Knowledge through Families of Experiments," IEEE Transactions on Software Engineering, vol. 25, no. 4, pp. 456-473, July-Aug. 1999, doi:10.1109/32.799939