Ninth International Software Metrics Symposium (METRICS'03)
Dealing with Missing Software Project Data
Sydney, Australia
September 03-September 05
ISBN: 0-7695-1987-3
M. H. Cartwright, Bournemouth University
M. J. Shepperd, Bournemouth University
Q. Song, Bournemouth University
Whilst there is a general consensus that quantitative approaches are an important part of successful software project management, there has been relatively little research into many of the obstacles to data collection and analysis in the real world. One feature that characterises many of the data sets we deal with is missing or highly questionable values. Naturally this problem is not unique to software engineering, so in this paper we explore the application of two existing data imputation techniques that have been used to good effect elsewhere. In order to assess the potential value of imputation we use two industrial data sets. Both are quite problematic from an effort modelling perspective because they contain few cases, have a significant number of missing values and the projects are quite heterogeneous. The question we pose is: can imputation help? To answer it, we compare the quality of fit of effort models derived by stepwise regression on the raw data with that of models derived from data sets whose missing values have been imputed by various techniques. In both data sets we find that k-Nearest Neighbour (k-NN) and sample mean imputation (SMI) significantly improve the model fit, with k-NN giving the best results. These results are consistent with other recently published results; consequently, we conclude that imputation can assist empirical software engineering.
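The abstract names the two imputation techniques but does not describe how they were implemented, so the following is only a minimal sketch of the general ideas, not the authors' procedure. It assumes numeric features, marks missing values as `None`, measures similarity with a Euclidean distance over the columns two projects both report, and fills a gap with the unweighted mean of the k nearest donors. The `projects` table and its columns (size, team size, effort) are hypothetical illustration data, and the features are left unscaled for brevity; in practice they would normally be standardised first.

```python
import math

def mean_impute(rows):
    """Sample mean imputation: fill each None with its column's observed mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def _distance(a, b):
    """Euclidean distance over columns observed in both rows, averaged
    over the number of shared columns (an assumption of this sketch)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k=2):
    """k-NN imputation: fill each None with the mean of that column
    taken from the k most similar rows that have the value observed."""
    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows with column j observed and at least
            # one column in common with the incomplete row.
            donors = [
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                result[i][j] = sum(v for _, v in nearest) / len(nearest)
    return result

# Hypothetical projects: [size, team size, effort]; None marks a missing value.
projects = [
    [10.0, 2.0, 100.0],
    [12.0, None, 120.0],
    [50.0, 8.0, 500.0],
    [11.0, 3.0, None],
]

smi_filled = mean_impute(projects)
knn_filled = knn_impute(projects, k=2)
```

On this toy table the missing effort of the fourth project is filled from the two small, similar projects under k-NN, whereas SMI pulls it towards the overall mean, which the single large project inflates. That difference is one plausible reason a neighbour-based method might suit heterogeneous project data, although the sketch itself demonstrates nothing about the paper's data sets.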
Index Terms:
project effort estimation, imputation, data analysis
Citation:
M. H. Cartwright, M. J. Shepperd, Q. Song, "Dealing with Missing Software Project Data," metrics, pp.154, Ninth International Software Metrics Symposium (METRICS'03), 2003