A Probabilistic Model for Predicting Software Development Effort
July 2005 (vol. 31 no. 7)
pp. 615-624
Recently, Bayesian probabilistic models have been used for predicting software development effort. One of the reasons for the interest in the use of Bayesian probabilistic models, when compared to traditional point forecast estimation models, is that Bayesian models provide tools for risk estimation and allow decision-makers to combine historical data with subjective expert estimates. In this paper, we use a Bayesian network model and illustrate how a belief updating procedure can be used to incorporate decision-making risks. We develop a causal model from the literature and, using a data set of 33 real-world software projects, we illustrate how decision-making risks can be incorporated in the Bayesian networks. We compare the predictive performance of the Bayesian model with popular nonparametric neural-network and regression tree forecasting models and show that the Bayesian model is a competitive model for forecasting software development effort.
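The belief updating mentioned in the abstract can be illustrated with a minimal sketch of Bayes' rule over discrete effort classes. This is not the authors' network or data: the effort classes, the prior, the likelihood values, and the function name `update_belief` are all hypothetical, chosen only to show how a prior (for example, from historical projects or an expert estimate) is revised once a piece of evidence is observed.

```python
def update_belief(prior, likelihood):
    """Return the posterior P(effort class | evidence) using Bayes' rule.

    prior:      dict mapping each effort class to P(class)
    likelihood: dict mapping each effort class to P(evidence | class)
    """
    unnormalized = {state: prior[state] * likelihood[state] for state in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence has zero probability under the prior")
    return {state: p / total for state, p in unnormalized.items()}


# Hypothetical prior over effort classes (assumed values, not from the paper).
prior = {"low": 0.5, "medium": 0.3, "high": 0.2}

# Hypothetical likelihood of one observed project attribute in each class
# (assumed values, not from the paper).
likelihood = {"low": 0.2, "medium": 0.5, "high": 0.8}

posterior = update_belief(prior, likelihood)
for state, p in posterior.items():
    print(f"{state}: {p:.3f}")
```

Because the result is a full distribution rather than a single point forecast, a decision-maker can read off the probability of each effort class, which is the property the abstract contrasts with traditional point estimation models.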

Index Terms: Bayesian belief networks, software effort estimation, probability theory.
Parag C. Pendharkar, Girish H. Subramanian, James A. Rodger, "A Probabilistic Model for Predicting Software Development Effort," IEEE Transactions on Software Engineering, vol. 31, no. 7, pp. 615-624, July 2005, doi:10.1109/TSE.2005.75