Issue No. 4, April 2013 (vol. 39)

pp: 537-551

Nikolaos Mittas , Aristotle University of Thessaloniki, Thessaloniki

Lefteris Angelis , Aristotle University of Thessaloniki, Thessaloniki

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TSE.2012.45

ABSTRACT

Software cost estimation is the process of predicting the most realistic effort required to complete a software project. Because accurate effort estimates underpin many crucial project management activities, the research community has focused on developing and applying a wide variety of methods and models to improve the estimation procedure. This diversity of methods, in turn, created the need for comparisons to determine the best model. However, the inconsistent results of such comparisons have raised significant doubts about the appropriateness of the comparison process in experimental studies. Overall, several potential sources of bias have to be considered in order to reinforce confidence in experimental results. In this paper, we propose a statistical framework based on a multiple comparisons algorithm that ranks several cost estimation models, identifies those with significant differences in accuracy, and clusters them into nonoverlapping groups. The proposed framework is applied in a large-scale setup comparing 11 prediction models over six datasets. The results illustrate the benefits and the significant information obtained through the systematic comparison of alternative methods.
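The clustering step described in the abstract builds on the Scott-Knott procedure (see references [11] and [26] below), which recursively partitions ordered group means into nonoverlapping homogeneous groups. As a rough illustration only, the sketch below clusters hypothetical model mean error values by repeatedly finding the two-way split that maximizes the between-group sum of squares; a fixed variance tolerance stands in for the algorithm's likelihood-ratio significance test, and all data values are invented.

```python
# Sketch of Scott-Knott-style clustering of model mean errors.
# Simplification: a variance tolerance replaces the real likelihood-ratio
# test against a chi-square distribution; the data below are hypothetical.

def best_split(means):
    """Return the ordered two-way split maximizing between-group sum of squares."""
    means = sorted(means)
    grand = sum(means) / len(means)
    best_bss, best_k = -1.0, 1
    for k in range(1, len(means)):
        g1, g2 = means[:k], means[k:]
        m1, m2 = sum(g1) / len(g1), sum(g2) / len(g2)
        bss = len(g1) * (m1 - grand) ** 2 + len(g2) * (m2 - grand) ** 2
        if bss > best_bss:
            best_bss, best_k = bss, k
    return means[:best_k], means[best_k:]

def scott_knott(means, tol=1e-3):
    """Recursively partition means into nonoverlapping homogeneous groups."""
    means = sorted(means)
    if len(means) < 2:
        return [means]
    grand = sum(means) / len(means)
    var = sum((m - grand) ** 2 for m in means) / len(means)
    # Stand-in stopping rule: a low-variance set is one homogeneous group.
    if var < tol:
        return [means]
    g1, g2 = best_split(means)
    return scott_knott(g1, tol) + scott_knott(g2, tol)

# Hypothetical mean MMRE values for five models:
groups = scott_knott([0.42, 0.45, 0.44, 0.90, 0.88])
# → [[0.42, 0.44, 0.45], [0.88, 0.90]]
```

The recursion yields nonoverlapping clusters of models whose accuracies are indistinguishable under the stopping rule, which is the structure the paper's framework reports for its 11 models.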

INDEX TERMS

Predictive models, Estimation, Accuracy, Measurement uncertainty, Prediction algorithms, Clustering algorithms, Systematics, statistical methods, Cost estimation, management, metrics/measurement

CITATION

Nikolaos Mittas, Lefteris Angelis, "Ranking and Clustering Software Cost Estimation Models through a Multiple Comparisons Algorithm", *IEEE Transactions on Software Engineering*, vol. 39, no. 4, pp. 537-551, April 2013, doi:10.1109/TSE.2012.45

REFERENCES

- [1] M. Jorgensen and M. Shepperd, "A Systematic Review of Software Development Cost Estimation Studies," IEEE Trans. Software Eng., vol. 33, no. 1, pp. 33-53, Jan. 2007.
- [2] M. Shepperd and G. Kadoda, "Comparing Software Prediction Techniques Using Simulation," IEEE Trans. Software Eng., vol. 27, no. 11, pp. 1014-1022, Nov. 2001.
- [3] B. Kitchenham, S. MacDonell, L. Pickard, and M. Shepperd, "What Accuracy Statistics Really Measure," IEE Proc. Software Eng., vol. 148, pp. 81-85, June 2001.
- [4] T. Foss, E. Stensrud, B. Kitchenham, and I. Myrtveit, "A Simulation Study of the Model Evaluation Criterion MMRE," IEEE Trans. Software Eng., vol. 29, no. 11, pp. 985-995, Nov. 2003.
- [5] N. Mittas and L. Angelis, "Comparing Cost Prediction Models by Resampling Techniques," J. Systems and Software, vol. 81, no. 5, pp. 616-632, May 2008.
- [6] E. Stensrud and I. Myrtveit, "Human Performance Estimating with Analogy and Regression Models: An Empirical Validation," Proc. IEEE Fifth Int'l Software Metrics Symp., pp. 205-213, Nov. 1998.
- [7] B. Kitchenham and E. Mendes, "Why Comparative Effort Prediction Studies May Be Invalid," Proc. ACM Fifth Int'l Conf. Predictor Models in Software Eng., pp. 1-5, May 2009.
- [8] I. Myrtveit, E. Stensrud, and M. Shepperd, "Reliability and Validity in Comparative Studies of Software Prediction Models," IEEE Trans. Software Eng., vol. 31, no. 5, pp. 380-391, May 2005.
- [9] S. Lessmann, B. Baesens, C. Mues, and S. Pietsch, "Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings," IEEE Trans. Software Eng., vol. 34, no. 4, pp. 485-496, July/Aug. 2008.
- [10] J. Antony, Design of Experiments for Engineers and Scientists. Butterworth-Heinemann, 2003.
- [11] A. Scott and M. Knott, "A Cluster Analysis Method for Grouping Means in the Analysis of Variance," Biometrics, vol. 30, no. 3, pp. 507-512, Sept. 1974.
- [12] J. Sayyad Shirabad and T. Menzies, "The PROMISE Repository of Software Engineering Databases," School of Information Technology and Eng., Univ. of Ottawa, http://promise.site.uottawa.ca/SERepository, 2005.
- [13] ISBSG Data Set 10, http://www.isbsg.org, 2007.
- [14] Y. Miyazaki, M. Terakado, K. Ozaki, and H. Nozaki, "Robust Regression for Developing Software Estimation Models," J. Systems and Software, vol. 27, pp. 3-16, 1994.
- [15] T. Menzies, O. Jalali, J. Hihn, D. Baker, and K. Lum, "Stable Rankings for Different Effort Models," Automated Software Eng., vol. 17, no. 4, pp. 409-437, Dec. 2010.
- [16] G. Tsoumakas, L. Angelis, and I. Vlahavas, "Selective Fusion of Heterogeneous Classifiers," Intelligent Data Analysis, vol. 9, no. 6, pp. 511-525, Dec. 2005.
- [17] J. Demšar, "Statistical Comparisons of Classifiers over Multiple Data Sets," J. Machine Learning Research, vol. 7, pp. 1-30, 2006.
- [18] L. Briand, T. Langley, and I. Wieczorek, "A Replicated Assessment and Comparison of Common Software Cost Modeling Techniques," Proc. 22nd IEEE Int'l Conf. Software Eng., pp. 377-386, 2000.
- [19] B. Kitchenham, S. Pfleeger, B. McColl, and S. Eagan, "An Empirical Study of Maintenance and Development Accuracy," J. Systems and Software, vol. 64, no. 1, pp. 57-77, Oct. 2002.
- [20] B. Kitchenham, E. Mendes, and H. Travassos, "Cross versus Within-Company Cost Estimation Studies: A Systematic Review," IEEE Trans. Software Eng., vol. 33, no. 5, pp. 316-329, May 2007.
- [21] R. Miller, Simultaneous Statistical Inference, second ed. McGraw-Hill, 1981.
- [22] D. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, third ed. Chapman & Hall/CRC, 2004.
- [23] Y. Hochberg and A. Tamhane, Multiple Comparison Procedures. Wiley & Sons, 1987.
- [24] E. Da Silva, D. Ferreira, and E. Bearzoti, "Evaluation of Power and Type I Error Rates of Scott-Knott's Test by the Method of Monte Carlo," Ciências Agrotécnicas, vol. 23, pp. 687-696, 1999.
- [25] L. Borges and D. Ferreira, "Power and Type I Errors Rate of Scott-Knott, Tukey and Newman-Keuls Tests under Normal and No-Normal Distributions of the Residues," Revista de Matemática e Estatística, vol. 21, no. 1, pp. 67-83, 2003.
- [26] E. Jelihovschi and J. Faria, "ScottKnott: A Package for Performing the Scott-Knott Clustering Algorithm in R," The R J., article in press.
- [27] E. Ferreira, A. Dusi, J. Costa, G. Xavier, and N. Rumjanek, "Assessing Insecticide and Fungicide Effects on the Culturable Soil Bacterial Community by Analyses of Variance of Their DGGE Fingerprinting Data," European J. Soil Biology, vol. 45, nos. 5-6, pp. 466-472, Sept.-Dec. 2009.
- [28] M. Ferreira, F. De Araujo, D. Costa, P. Rosa, H. Figueiredo, and L. Murgas, "Influence of Dietary Oil Sources on Muscle Composition and Plasma Lipoprotein Concentrations in Nile Tilapia, Oreochromis Niloticus," J. World Aquaculture Soc., vol. 42, no. 1, pp. 24-33, Feb. 2011.
- [29] D. Montgomery, Design and Analysis of Experiments. John Wiley & Sons, 1991.
- [30] G. Blom, Statistical Estimates and Transformed Beta-Variables. Wiley, 1958.
- [31] A. Abran and P. Robillard, "Function Points Analysis: An Empirical Study of Its Measurement Processes," IEEE Trans. Software Eng., vol. 22, no. 12, pp. 895-910, Dec. 1996.
- [32] A. Gray and S. MacDonell, "Software Metrics Data Analysis—Exploring the Relative Performance of Some Commonly Used Modeling Techniques," Empirical Software Eng., vol. 4, no. 4, pp. 297-316, Dec. 1999.
- [33] R. Jeffery, M. Ruhe, and I. Wieczorek, "Using Public Domain Metrics to Estimate Software Development Effort," Proc. Seventh Int'l Software Metrics Symp., pp. 16-27, Apr. 2001.
- [34] M. Shepperd and C. Schofield, "Estimating Software Project Effort Using Analogies," IEEE Trans. Software Eng., vol. 23, no. 11, pp. 736-743, Nov. 1997.
- [35] E. Mendes, I. Watson, C. Triggs, N. Mosley, and S. Counsell, "A Comparative Study of Cost Estimation Models for Web Hypermedia Applications," Empirical Software Eng., vol. 8, no. 2, pp. 163-196, June 2003.
- [36] Z. Li, G. Ruhe, A. Al-Emran, and M. Richter, "A Flexible Method for Software Effort Estimation by Analogy," Empirical Software Eng., vol. 12, no. 1, pp. 65-106, Feb. 2007.
- [37] N. Mittas, M. Athanasiades, and L. Angelis, "Improving Analogy-Based Software Cost Estimation by a Resampling Method," Information and Software Technology, vol. 50, no. 3, pp. 221-230, Feb. 2008.
- [38] G. Finnie, G. Wittig, and J. Desharnais, "A Comparison of Software Effort Estimation Techniques: Using Function Points with Neural Networks, Case-Based Reasoning and Regression Models," J. Systems and Software, vol. 39, no. 3, pp. 281-289, Dec. 1997.
- [39] P. Pendharkar, G. Subramanian, and J. Rodger, "A Probabilistic Model for Predicting Software Development Effort," IEEE Trans. Software Eng., vol. 31, no. 7, pp. 615-624, July 2005.
- [40] L. Briand, V. Basili, and W. Thomas, "A Pattern Recognition Approach for Software Engineering Data Analysis," IEEE Trans. Software Eng., vol. 18, no. 11, pp. 931-942, Nov. 1992.
- [41] L. Kaufman and P. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, 1990.
- [42] M. Shepperd, C. Schofield, and B. Kitchenham, "Effort Estimation Using Analogy," Proc. 18th IEEE Int'l Conf. Software Eng., pp. 170-178, 1996.
- [43] J. Keung, B. Kitchenham, and R. Jeffery, "Analogy-X: Providing Statistical Inference to Analogy-Based Software Cost Estimation," IEEE Trans. Software Eng., vol. 34, no. 4, pp. 471-484, July/Aug. 2008.
- [44] P. Cortez, "Data Mining with Neural Networks and Support Vector Machines Using the R/rminer Tool," Advances in Data Mining: Applications and Theoretical Aspects, vol. 6171, pp. 572-583, 2010.
- [45] L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees. Wadsworth Int'l Group, 1984.
- [46] SPLUS 6 for Windows, Guide to Statistics, vol. 2. Insightful Corp., 2001.
- [47] G. Liebchen and M. Shepperd, "Data Sets and Data Quality in Software Engineering," Proc. Fourth Int'l Workshop Predictor Models in Software Eng., pp. 39-44, May 2008.
- [48] J. Keung, "Empirical Evaluation of Analogy-X for Software Cost Estimation," Proc. Second ACM-IEEE Int'l Symp. Empirical Software Eng. and Measurement, pp. 294-296, Oct. 2008.
- [49] P. Ellis, The Essential Guide to Effect Sizes: Statistical Power, Meta-Analysis, and the Interpretation of Research Results. Cambridge Univ. Press, 2010.
- [50] B. Kitchenham, "The Question of Scale Economies in Software—Why Cannot Researchers Agree?" Information and Software Technology, vol. 44, no. 1, pp. 13-24, Jan. 2002.
- [51] B. Kitchenham, "A Procedure for Analyzing Unbalanced Data Sets," IEEE Trans. Software Eng., vol. 24, no. 4, pp. 278-301, Apr. 1998.
- [52] T. Menzies and M. Shepperd, "Special Issue on Repeatable Results in Software Engineering Prediction," Empirical Software Eng., vol. 17, nos. 1/2, pp. 1-17, 2012.
- [53] R. Kohavi, "A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection," Proc. 14th Int'l Joint Conf. Artificial Intelligence, pp. 1137-1145, 1995.
- [54] B. Efron, "Estimating the Error Rate of a Prediction Rule: Improvement on Cross-Validation," J. Am. Statistical Assoc., vol. 78, pp. 316-330, 1983.