Sixth IEEE International Conference on Data Mining (ICDM'06)
COALA: A Novel Approach for the Extraction of an Alternate Clustering of High Quality and High Dissimilarity
Hong Kong
December 18-22, 2006
ISBN: 0-7695-2701-9
Cluster analysis has long been a fundamental task in data mining and machine learning. However, traditional clustering methods concentrate on producing a single solution, even though multiple alternative clusterings may exist. It is thus difficult for the user to validate whether the given solution is in fact appropriate, particularly for large and complex datasets. In this paper we explore the critical requirements for systematically finding a new clustering when an already known clustering is available, and we propose a novel algorithm, COALA, to discover this new clustering. Our approach is driven by two important factors: dissimilarity and quality. These are especially important for finding a new clustering that is highly informative about the underlying structure of the data but is at the same time distinctly different from the provided clustering. We undertake an experimental analysis and show that our method is able to outperform existing techniques on both synthetic and real datasets.
Citation:
Eric Bae, James Bailey, "COALA: A Novel Approach for the Extraction of an Alternate Clustering of High Quality and High Dissimilarity," Sixth IEEE International Conference on Data Mining (ICDM'06), pp. 53-62, 2006