2009 Ninth IEEE International Conference on Data Mining
Unsupervised Class Separation of Multivariate Data through Cumulative Variance-Based Ranking
Miami, Florida
December 06-December 09
ISBN: 978-0-7695-3895-2
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2009.17
This paper introduces a new extension of outlier detection approaches and a new concept, class separation through variance. We show that accumulating information about the outlierness of points in multiple subspaces leads to a ranking in which classes with differing variance naturally tend to separate. Exploiting this leads to a highly effective and efficient unsupervised class separation approach, especially useful in the difficult case of heavily overlapping distributions. Unlike typical outlier detection algorithms, this method can be applied beyond the 'rare classes' case with great success. Two novel algorithms that implement this approach are provided. Additionally, experiments show that, on high-dimensional data, the novel methods typically outperform other state-of-the-art outlier detection methods such as Feature Bagging, SOE1, LOF, ORCA and Robust Mahalanobis Distance, and compete even with the leading supervised classification methods.
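The core idea of the abstract, accumulating outlierness over many subspaces and ranking by the total, can be illustrated with a minimal sketch. This is not either of the paper's two algorithms, which the abstract does not specify. The outlierness measure (squared standardized deviation from the per-dimension mean), the random choice of subspaces, the synthetic data, and all names (`cumulative_outlier_ranking`, `make_overlapping_classes`, `n_subspaces`, `subspace_dim`) are assumptions made only for illustration.

```python
import random
import statistics


def cumulative_outlier_ranking(points, n_subspaces=50, subspace_dim=3, seed=0):
    """Rank points by outlierness accumulated over random subspaces.

    Assumption: outlierness in a subspace is the sum of squared
    standardized deviations over that subspace's dimensions. The paper
    may use a different measure.
    """
    rng = random.Random(seed)
    n_dims = len(points[0])
    # Per-dimension centre and spread, estimated without using labels.
    columns = list(zip(*points))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]

    totals = [0.0] * len(points)
    for _ in range(n_subspaces):
        # Hypothetical subspace choice: a random subset of dimensions.
        dims = rng.sample(range(n_dims), subspace_dim)
        for i, p in enumerate(points):
            totals[i] += sum(((p[d] - means[d]) / stdevs[d]) ** 2 for d in dims)

    # Highest cumulative score first.
    order = sorted(range(len(points)), key=lambda i: totals[i], reverse=True)
    return order, totals


def make_overlapping_classes(n_per_class=200, n_dims=20, seed=1):
    """Synthetic data: two classes with the same mean but different variance."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, sigma in ((0, 1.0), (1, 2.0)):
        for _ in range(n_per_class):
            points.append([rng.gauss(0.0, sigma) for _ in range(n_dims)])
            labels.append(label)
    return points, labels


points, labels = make_overlapping_classes()
order, totals = cumulative_outlier_ranking(points)

# Fraction of the top-ranked half that belongs to the high-variance class.
top_half = order[: len(order) // 2]
purity = sum(labels[i] for i in top_half) / len(top_half)
print(f"high-variance share of top half: {purity:.2f}")
```

In this toy setting both classes are centred on the same point, so they overlap heavily and neither is rare, yet the high-variance class tends to collect at the top of the cumulative ranking. The labels are used only to evaluate the ranking, never to build it.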
Index Terms:
Outlier Detection, Classification, Subspaces
Citation:
Andrew Foss, Osmar R. Zaïane, Sandra Zilles, "Unsupervised Class Separation of Multivariate Data through Cumulative Variance-Based Ranking," icdm, pp. 139-148, 2009 Ninth IEEE International Conference on Data Mining, 2009