Estimating Software Project Effort Using Analogies
November 1997 (vol. 23 no. 11)
pp. 736-743

Abstract—Accurate project effort prediction is an important goal for the software engineering community. To date most work has focused upon building algorithmic models of effort, for example COCOMO. These can be calibrated to local environments. We describe an alternative approach to estimation based upon the use of analogies. The underlying principle is to characterize projects in terms of features (for example, the number of interfaces, the development method or the size of the functional requirements document). Completed projects are stored and then the problem becomes one of finding the most similar projects to the one for which a prediction is required. Similarity is defined as Euclidean distance in n-dimensional space where n is the number of project features. Each dimension is standardized so all dimensions have equal weight. The known effort values of the nearest neighbors to the new project are then used as the basis for the prediction. The process is automated using a PC-based tool known as ANGEL. The method is validated on nine different industrial datasets (a total of 275 projects) and in all cases analogy outperforms algorithmic models based upon stepwise regression. From this work we argue that estimation by analogy is a viable technique that, at the very least, can be used by project managers to complement current estimation techniques.
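
The procedure described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the ANGEL tool itself: the function names, the min-max scaling used to standardize each dimension, the choice of k, and the use of the unweighted mean of the neighbors' effort are all assumptions made here, and the project data is invented for the example.

```python
import math


def standardize(rows):
    """Min-max scale every feature column to [0, 1] (assumed standardization)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # Guard against a zero range when all projects share the same value.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        [(value - low) / span for value, low, span in zip(row, lows, spans)]
        for row in rows
    ]


def estimate_effort(features, efforts, target, k=2):
    """Predict effort for `target` from its k nearest completed projects."""
    # Scale the completed projects and the new project together so that
    # every dimension carries equal weight in the distance.
    *cases, query = standardize(features + [target])
    # Euclidean distance in n-dimensional feature space.
    ranked = sorted(
        (math.dist(case, query), effort) for case, effort in zip(cases, efforts)
    )
    nearest = ranked[:k]
    # Assumption: the prediction is the plain mean of the neighbors' effort.
    return sum(effort for _, effort in nearest) / len(nearest)


# Hypothetical completed projects: [number of interfaces, requirements pages].
projects = [[5, 20], [10, 40], [20, 100], [8, 30]]
efforts = [100.0, 200.0, 600.0, 160.0]  # invented effort values

print(estimate_effort(projects, efforts, [9, 35], k=2))
```

With this invented data the two nearest neighbors of the new project are the second and fourth completed projects, so the estimate is the mean of their effort values. Other ways of combining the neighbors, such as a distance-weighted mean, would fit the same structure.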

Index Terms:
Effort prediction, estimation process, empirical investigation, analogy, case-based reasoning.
Martin Shepperd, Chris Schofield, "Estimating Software Project Effort Using Analogies," IEEE Transactions on Software Engineering, vol. 23, no. 11, pp. 736-743, Nov. 1997, doi:10.1109/32.637387