2007 IEEE 23rd International Conference on Data Engineering (2007)
Istanbul, Turkey
Apr. 15, 2007 to Apr. 20, 2007
ISBN: 1-4244-0802-4
pp: 1250-1254
Guimei Liu, National University of Singapore, Singapore, liugm@comp.nus.edu.sg
Jinyan Li, Institute for Infocomm Research, Singapore, jinyan@i2r.a-star.edu.sg
Kelvin Sim, Institute for Infocomm Research, Singapore, shsim@i2r.a-star.edu.sg
Limsoon Wong, National University of Singapore, Singapore, wongls@comp.nus.edu.sg
ABSTRACT
Traditional similarity and distance measures tend to lose their meaning as the dimensionality of a dataset increases, which degrades clustering performance. In this paper, we propose a distance-based subspace clustering model, called nCluster, to find groups of objects that have similar values on subsets of dimensions. Instead of using a grid-based approach that partitions the data space into non-overlapping rectangular cells, as density-based subspace clustering algorithms do, the nCluster model partitions the dimensions more flexibly so that meaningful and significant clusters are preserved. We develop an efficient algorithm that mines only maximal nClusters. A set of experiments shows the efficiency of the proposed algorithm and the effectiveness of the new model in preserving significant clusters.
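The abstract's opening claim, that full-space distances lose discriminating power in high dimensions, can be illustrated with a small experiment. The sketch below is not the nCluster algorithm and is not taken from the paper. It assumes uniformly random points in the unit hypercube, and the function name `relative_contrast` is hypothetical. It measures how much farther the farthest point is from a query than the nearest one, relative to the nearest distance.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest for one random query point.

    Assumption: points are drawn uniformly from the unit hypercube.
    This is an illustrative setup, not the data used in the paper.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        # Euclidean distance over all dimensions (full-space distance).
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # The contrast should shrink as the number of dimensions grows.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the contrast is large in two dimensions and small in a thousand, which is one common way to motivate comparing objects on subsets of dimensions rather than on all of them.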
CITATION

G. Liu, L. Wong, K. Sim and J. Li, "Distance Based Subspace Clustering with Flexible Dimension Partitioning," 2007 IEEE 23rd International Conference on Data Engineering (ICDE), Istanbul, Turkey, 2007, pp. 1250-1254.
doi:10.1109/ICDE.2007.368985