Issue No. 04 - April (2008 vol. 20)
Semi-supervised clustering algorithms partition a given data set using limited supervision from the user. The success of these algorithms depends on the type of supervision and on the dissimilarity measure used to partition the space. This paper proposes a clustering algorithm that takes supervision in the form of relative comparisons, viz., x is closer to y than to z. The proposed algorithm uses these relative comparisons to learn the underlying dissimilarity measure while simultaneously finding compact clusters in the given data set. Through experimental studies on high-dimensional textual data sets, we demonstrate that the proposed algorithm achieves higher accuracy and is more robust than similar algorithms that use pairwise constraints for supervision.
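As a rough illustration of what a relative comparison "x is closer to y than to z" means once a dissimilarity measure is fixed, the following minimal sketch checks a set of such comparisons under a feature-weighted squared Euclidean distance. The weighted distance, the function names (weighted_sq_dist, violated_comparisons), and the toy data are assumptions made for illustration only; the abstract does not specify the form of the learned measure, and this is not the paper's algorithm.

```python
import numpy as np

def weighted_sq_dist(a, b, w):
    """Feature-weighted squared Euclidean distance (an assumed, illustrative measure)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sum(np.asarray(w, dtype=float) * diff * diff))

def violated_comparisons(points, triplets, w):
    """Return the triplets (x, y, z) for which x is NOT closer to y than to z under weights w."""
    violated = []
    for x, y, z in triplets:
        d_xy = weighted_sq_dist(points[x], points[y], w)
        d_xz = weighted_sq_dist(points[x], points[z], w)
        if d_xy >= d_xz:
            violated.append((x, y, z))
    return violated

# Toy data: three points in two dimensions (hypothetical values).
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 2.0]])

# One relative comparison: "point 0 is closer to point 1 than to point 2".
triplets = [(0, 1, 2)]

# With uniform weights the comparison is violated; down-weighting the
# first feature changes the measure so that the comparison is satisfied.
uniform = violated_comparisons(points, triplets, w=[1.0, 1.0])
reweighted = violated_comparisons(points, triplets, w=[0.1, 1.0])
```

The sketch only shows why the choice of measure matters: the same comparison can be violated under one weighting and satisfied under another, which is the kind of signal a method that learns the measure from relative comparisons could exploit.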
Semi-supervised learning, Clustering, Dissimilarity Measures, Constraint-based Clustering
N. Kumar and K. Kummamuru, "Semisupervised Clustering with Metric Learning using Relative Comparisons," in IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. 4, pp. 496-503, 2008.