Fourth IEEE International Conference on Data Mining (ICDM'04)
Subspace Selection for Clustering High-Dimensional Data
Brighton, United Kingdom
November 1-4, 2004
ISBN: 0-7695-2142-8
Citation:
Christian Baumgartner, Claudia Plant, Karin Kailing, Hans-Peter Kriegel, Peer Kröger, "Subspace Selection for Clustering High-Dimensional Data," Fourth IEEE International Conference on Data Mining (ICDM'04), pp. 11-18, IEEE Computer Society, Los Alamitos, CA, USA, 2004. DOI: 10.1109/ICDM.2004.10112
In high-dimensional feature spaces, traditional clustering algorithms tend to break down in both efficiency and quality. Nevertheless, such data sets often contain clusters hidden in various subspaces of the original feature space. In this paper, we present a feature selection technique called SURFING (SUbspaces Relevant For clusterING) that finds all subspaces interesting for clustering and sorts them by relevance. The sorting is based on a quality criterion for the interestingness of a subspace, computed from the k-nearest neighbor distances of the objects. Because our method is nearly parameter-free, it is well suited to the unsupervised nature of the data mining task "clustering". A broad evaluation on synthetic and real-world data sets demonstrates that SURFING is suitable for finding all relevant subspaces in high-dimensional, sparse data sets and produces better results than comparable methods.
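To make the idea of ranking subspaces by their k-nearest neighbor distances concrete, here is a minimal Python sketch. It is not the criterion defined in the paper, since the abstract does not spell that out. The score used here is an assumption: it measures how strongly the k-NN distances in a subspace deviate from their mean, on the reasoning that uniformly spread data yields near-identical distances while clustered data does not. The function names, the exhaustive enumeration of subspaces, and the toy data are all hypothetical and chosen for illustration only.

```python
import itertools
import math
import random


def knn_distances(points, dims, k):
    """Distance from each point to its k-th nearest neighbor, using only `dims`."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(
            math.sqrt(sum((p[j] - q[j]) ** 2 for j in dims))
            for m, q in enumerate(points)
            if m != i
        )
        result.append(dists[k - 1])
    return result


def subspace_score(points, dims, k):
    """Assumed score: spread of k-NN distances around their mean, normalized.

    Returns 0.0 when all k-NN distances are equal (no visible structure).
    """
    d = knn_distances(points, dims, k)
    mean = sum(d) / len(d)
    if mean == 0:
        return 0.0
    below = sum(1 for x in d if x < mean)  # points in denser-than-average regions
    if below == 0:
        return 0.0
    diff = 0.5 * sum(abs(mean - x) for x in d)
    return diff / (below * mean)


def rank_subspaces(points, max_dim, k):
    """Score every subspace up to `max_dim` features; highest score first.

    Exhaustive enumeration is a simplification that is only feasible on toy data.
    """
    n_features = len(points[0])
    scored = []
    for size in range(1, max_dim + 1):
        for dims in itertools.combinations(range(n_features), size):
            scored.append((subspace_score(points, dims, k), dims))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: two tight clusters in features 0 and 1, uniform noise in feature 2.
    data = []
    for center in ((0.2, 0.2), (0.8, 0.8)):
        for _ in range(30):
            data.append((
                center[0] + rng.gauss(0, 0.02),
                center[1] + rng.gauss(0, 0.02),
                rng.random(),
            ))
    for score, dims in rank_subspaces(data, max_dim=2, k=3):
        print(dims, round(score, 3))
```

The sketch uses brute-force neighbor search and scores every feature combination, so its cost grows quickly with the number of objects and features. A practical method would need an index structure for the neighbor queries and a strategy for avoiding full enumeration.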
