Issue No. 06 - June (2012 vol. 24)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2011.163
Partha Sarathi Bishnu, Birla Institute of Technology, Ranchi
Vandana Bhattacherjee, Birla Institute of Technology, Ranchi
Unsupervised techniques such as clustering may be used for fault prediction in software modules, particularly when fault labels are not available. In this paper, a Quad Tree-based K-Means algorithm is applied to predict faults in program modules. The aims of this paper are twofold. First, Quad Trees are applied to find the initial cluster centers that are input to the K-Means algorithm. An input threshold parameter δ governs the number of initial cluster centers, and by varying δ the user can generate the desired initial cluster centers. The concept of clustering gain is used to judge cluster quality when evaluating the Quad Tree-based initialization algorithm against other initialization techniques. The clusters obtained by the Quad Tree-based algorithm were found to have the maximum gain values. Second, the Quad Tree-based algorithm is applied to predict faults in program modules. The overall error rates of this prediction approach are compared with those of other existing algorithms and are found to be better in most cases.
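To illustrate the general idea of seeding K-Means from a Quad Tree, the following is a minimal sketch, not the authors' algorithm. The abstract does not say how the threshold δ is applied, so this sketch assumes it is a cell capacity: a cell is split into four quadrants until it holds at most `delta` points, and the centroid of each non-empty leaf becomes an initial center. The function names (`quadtree_centers`, `kmeans`, `assign`, `centroid`), the 2-D restriction, and the `max_depth` guard are all hypothetical choices made for this example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def quadtree_centers(points, delta, depth=0, max_depth=16):
    """Centroids of quad-tree leaves; a leaf holds at most `delta` points.

    Assumption: `delta` is treated as a cell capacity. The paper's actual
    use of its threshold parameter may differ.
    """
    if not points:
        return []
    if len(points) <= delta or depth >= max_depth:
        return [centroid(points)]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mid_x = (min(xs) + max(xs)) / 2
    mid_y = (min(ys) + max(ys)) / 2
    quadrants = [[], [], [], []]
    for p in points:
        # Quadrant index: bit 0 = right half, bit 1 = upper half.
        quadrants[(p[0] > mid_x) + 2 * (p[1] > mid_y)].append(p)
    centers = []
    for quadrant in quadrants:
        centers.extend(quadtree_centers(quadrant, delta, depth + 1, max_depth))
    return centers


def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            for p in points]


def kmeans(points, centers, max_iters=100):
    """Plain Lloyd iterations starting from the supplied centers."""
    for _ in range(max_iters):
        labels = assign(points, centers)
        updated = []
        for i, center in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old center if no point was assigned to it.
            updated.append(centroid(members) if members else center)
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


if __name__ == "__main__":
    # Toy module metrics: two well-separated groups of four points each.
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    seeds = quadtree_centers(data, delta=4)
    final_centers, labels = kmeans(data, seeds)
    print(len(seeds), labels)
```

In this sketch a smaller `delta` produces more leaves and therefore more initial centers, which mirrors the abstract's statement that the threshold governs the number of initial cluster centers. How clusters are then mapped to "faulty" or "non-faulty" is not described in the abstract and is left out here.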
K-Means clustering, Quad Tree, software fault prediction.
P. S. Bishnu and V. Bhattacherjee, "Software Fault Prediction Using Quad Tree-Based K-Means Clustering Algorithm," in IEEE Transactions on Knowledge & Data Engineering, vol. 24, no. 6, pp. 1146-1150, 2012.