Predicting with Sparse Data
November 2001 (vol. 27 no. 11)
pp. 987-998

Abstract: It is well known that effective prediction of project cost-related factors is an important aspect of software engineering. Unfortunately, despite extensive research over more than 30 years, this remains a significant problem for many practitioners. A major obstacle is the absence of reliable and systematic historical data, yet such data is a sine qua non for almost all proposed methods: statistical, machine learning, or calibration of existing models. In this paper, we describe our sparse data method (SDM), based upon a pairwise comparison technique and Saaty's Analytic Hierarchy Process (AHP). Our minimum data requirement is a single known point. The technique is supported by a software tool known as DataSalvage. We show, for data from two companies, how our approach, although itself based upon expert judgement, adds value to unaided expert judgement by producing significantly more accurate and less biased results. A sensitivity analysis shows that our approach is robust to pairwise comparison errors. We then describe the results of a small usability trial with a practicing project manager. From this empirical work, we conclude that the technique is promising and may help overcome some of the present barriers to effective project prediction.
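
The abstract does not give the details of SDM, so the following is only a minimal sketch of the general idea it names: turning pairwise comparisons into relative weights and then scaling them by a single known point. It assumes the row geometric mean approximation to the AHP priority vector rather than Saaty's principal eigenvector, and the function names (`priority_weights`, `predict_from_reference`) and the judgement values are hypothetical, not taken from the paper or from DataSalvage.

```python
import math


def priority_weights(matrix):
    """Approximate the AHP priority vector with row geometric means.

    matrix[i][j] holds the judged ratio "item i relative to item j".
    The returned weights are normalised so that they sum to 1.
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def predict_from_reference(matrix, ref_index, ref_value):
    """Scale relative weights to absolute values using one known point.

    ref_index identifies the single item whose true value (ref_value)
    is known; every other item is estimated in proportion to it.
    """
    weights = priority_weights(matrix)
    scale = ref_value / weights[ref_index]
    return [w * scale for w in weights]


# Hypothetical expert judgements for three projects (illustrative only):
# project 0 is judged twice the effort of project 1 and four times project 2.
judgements = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

# Assume project 2 is the single known point, with an effort of 100 units.
estimates = predict_from_reference(judgements, ref_index=2, ref_value=100.0)
print(estimates)  # roughly [400, 200, 100] for these consistent judgements
```

Because the example judgements are perfectly consistent, the estimates reproduce the stated ratios exactly; with inconsistent judgements the geometric mean smooths the disagreement across comparisons.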

Index Terms:
Prediction, software project effort, expert judgement, empirical data, sparse data.
Martin Shepperd, Michelle Cartwright, "Predicting with Sparse Data," IEEE Transactions on Software Engineering, vol. 27, no. 11, pp. 987-998, Nov. 2001, doi:10.1109/32.965339