Issue No.07 - July (2009 vol.21)

pp: 1014-1026

Min Chen, Tongji University, Shanghai

Pawan Lingras, Saint Mary's University, Halifax

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2008.236

ABSTRACT

The quality of clustering is an important issue in the application of clustering techniques. Most traditional cluster validity indices are geometry-based cluster quality measures. This paper proposes a cluster validity index based on the decision-theoretic rough set model by considering various loss functions. Experiments with synthetic, standard, and real-world retail data show the usefulness of the proposed validity index for the evaluation of rough and crisp clustering. The measure is shown to help determine the optimal number of clusters, as well as an important parameter in rough clustering called the threshold. The experiments with a promotional campaign for the retail data illustrate the ability of the proposed measure to incorporate financial considerations in evaluating the quality of a clustering scheme. This ability to deal with monetary values distinguishes the proposed decision-theoretic measure from other distance-based measures. The proposed validity index can also be extended to evaluate other clustering algorithms such as fuzzy clustering.

INDEX TERMS

Cluster validity, decision theory, loss functions, rough-set-based clustering, k-means clustering.

CITATION

Min Chen, Pawan Lingras, "Rough Cluster Quality Index Based on Decision Theory", *IEEE Transactions on Knowledge & Data Engineering*, vol. 21, no. 7, pp. 1014-1026, July 2009, doi:10.1109/TKDE.2008.236