Issue No.05 - September/October (2003 vol.15)
pp: 1316-1337
Dantong Yu, IEEE
ABSTRACT
<p><b>Abstract</b>—In this paper, we introduce the <em>ClusterTree</em>, a new indexing approach to representing clusters generated by any existing clustering approach. A cluster is decomposed into several subclusters and represented as the union of the subclusters. The subclusters can be further decomposed, which isolates the most related groups within the clusters. A <em>ClusterTree</em> is a hierarchy of clusters and subclusters which incorporates the cluster representation into the index structure to achieve effective and efficient retrieval. Our cluster representation is highly adaptive to any kind of cluster. It is well accepted that most existing indexing techniques degrade rapidly as the dimensions increase. The <em>ClusterTree</em> provides a practical solution to index clustered data sets and supports the retrieval of the nearest-neighbors effectively without having to linearly scan the high-dimensional data set. We also discuss an approach to dynamically reconstruct the <em>ClusterTree</em> when new data is added. We present the detailed analysis of this approach and justify it extensively with experiments.</p>
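The idea of a hierarchy of clusters and subclusters that answers nearest-neighbor queries without a linear scan can be illustrated with a minimal sketch. This is not the authors' ClusterTree algorithm: it assumes each cluster is summarized by a bounding sphere (a center and a radius) and uses a generic branch-and-bound search. The names `Node`, `make_leaf`, `make_internal`, and `nearest` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class Node:
    """A cluster summarized by a bounding sphere (an assumption of this sketch)."""
    center: Point
    radius: float
    children: List["Node"] = field(default_factory=list)  # subclusters
    points: List[Point] = field(default_factory=list)     # data, leaves only


def make_leaf(points: List[Point]) -> Node:
    """Build a leaf cluster whose sphere encloses all of its points."""
    dim = len(points[0])
    center = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
    radius = max(dist(center, p) for p in points)
    return Node(center, radius, points=list(points))


def make_internal(children: List[Node]) -> Node:
    """Build a cluster as the union of subclusters, enclosing every child sphere."""
    dim = len(children[0].center)
    center = tuple(sum(c.center[i] for c in children) / len(children) for i in range(dim))
    radius = max(dist(center, c.center) + c.radius for c in children)
    return Node(center, radius, children=list(children))


def nearest(node: Node, query: Point,
            best: Tuple[float, Optional[Point]] = (math.inf, None)):
    """Branch-and-bound search: skip any cluster that cannot beat the current best."""
    # No point inside the sphere can be closer to the query than this bound.
    lower_bound = max(0.0, dist(query, node.center) - node.radius)
    if lower_bound >= best[0]:
        return best
    if node.points:  # leaf: compare against the stored points
        for p in node.points:
            d = dist(query, p)
            if d < best[0]:
                best = (d, p)
        return best
    # Visit the most promising subclusters first so pruning takes effect early.
    for child in sorted(node.children, key=lambda c: dist(query, c.center) - c.radius):
        best = nearest(child, query, best)
    return best


# Two well-separated clusters under one root.
leaf_a = make_leaf([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
leaf_b = make_leaf([(10.0, 10.0), (9.0, 10.0), (10.0, 9.0)])
root = make_internal([leaf_a, leaf_b])

d, p = nearest(root, (9.0, 9.5))
print(d, p)
```

In this toy example the search scans only the cluster near the query and prunes the other, because the far cluster's lower bound exceeds the best distance already found. Bounding spheres tend to lose pruning power as dimensionality grows, which is the kind of degradation the abstract refers to.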
INDEX TERMS
Indexing, cluster representation, nearest-neighbor search, high-dimensional data sets.
CITATION
Dantong Yu, Aidong Zhang, "<em>ClusterTree</em>: Integration of Cluster Representation and Nearest-Neighbor Search for Large Data Sets with High Dimensions", IEEE Transactions on Knowledge & Data Engineering, vol. 15, no. 5, pp. 1316-1337, September/October 2003, doi:10.1109/TKDE.2003.1232281