Issue No. 07 - July (2009 vol. 21)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2008.236
Pawan Lingras, Saint Mary's University, Halifax
Min Chen, Tongji University, Shanghai
Duoqian Miao, Tongji University, Shanghai
Cluster quality is an important issue in applying clustering techniques. Most traditional cluster validity indices are geometry-based measures of cluster quality. This paper proposes a cluster validity index based on the decision-theoretic rough set model by considering various loss functions. Experiments with synthetic, standard, and real-world retail data show the usefulness of the proposed validity index for evaluating rough and crisp clustering. The measure is shown to help determine the optimal number of clusters, as well as an important parameter in rough clustering called the threshold. Experiments with a promotional campaign for the retail data illustrate the ability of the proposed measure to incorporate financial considerations when evaluating the quality of a clustering scheme. This ability to deal with monetary values distinguishes the proposed decision-theoretic measure from distance-based measures. The proposed validity index can also be extended to evaluate other clustering algorithms, such as fuzzy clustering.
Cluster validity, decision theory, loss functions, rough-set-based clustering, k-means clustering.
D. Miao, M. Chen and P. Lingras, "Rough Cluster Quality Index Based on Decision Theory," in IEEE Transactions on Knowledge & Data Engineering, vol. 21, no. 7, pp. 1014-1026, 2009.