Fourth IEEE International Conference on Data Mining (ICDM'04)
SCHISM: A New Approach for Interesting Subspace Mining
Brighton, United Kingdom
November 1-4, 2004
ISBN: 0-7695-2142-8
High-dimensional data pose challenges to traditional clustering algorithms because the data are inherently sparse and tend to cluster in different, possibly overlapping, subspaces of the full feature space. Finding such subspaces is called subspace mining. We present SCHISM, a new algorithm for mining interesting subspaces that uses the notions of support and Chernoff-Hoeffding bounds. SCHISM uses a vertical representation of the dataset and a depth-first search with backtracking to find maximal interesting subspaces. We evaluate its effectiveness on a number of high-dimensional synthetic and real datasets.
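
The ideas named in the abstract can be illustrated with a minimal sketch. It is not the SCHISM implementation. It assumes that every attribute is scaled to [0, 1] and cut into equal-width intervals, that a (dimension, interval) pair is stored vertically as the set of ids of the points falling in it, and that a p-dimensional cell counts as interesting when its support exceeds the fraction expected under a uniform, independent distribution by a Hoeffding margin (from P(X/n - mu >= t) <= exp(-2nt^2)). The function names and the parameters bins and tau are hypothetical and chosen only for this illustration.

```python
import math
from collections import defaultdict


def vertical_index(points, bins):
    """Vertical layout: map each (dimension, interval) item to the ids of the points in it."""
    index = defaultdict(set)
    for pid, point in enumerate(points):
        for dim, value in enumerate(point):
            # Assumption: values lie in [0, 1]; 1.0 is clamped into the last interval.
            interval = min(int(value * bins), bins - 1)
            index[(dim, interval)].add(pid)
    return dict(index)


def interest_threshold(n, bins, p, tau):
    """Support fraction a p-dimensional cell must reach to be called interesting.

    Assumption: under a uniform, independent null model the expected fraction
    is (1/bins)**p; the Hoeffding margin makes the false-alarm probability <= tau.
    """
    return (1.0 / bins) ** p + math.sqrt(math.log(1.0 / tau) / (2.0 * n))


def mine_maximal_subspaces(points, bins=4, tau=0.01):
    """Depth-first search with backtracking over items; returns maximal interesting item sets."""
    n = len(points)
    index = vertical_index(points, bins)
    items = sorted(index)
    # The threshold decreases with p but never drops below this margin, so a
    # cell whose support is under it cannot have an interesting superset.
    floor = math.sqrt(math.log(1.0 / tau) / (2.0 * n))
    found = []

    def dfs(prefix, tids, start):
        for i in range(start, len(items)):
            item = items[i]
            if prefix and item[0] == prefix[-1][0]:
                continue  # at most one interval per dimension
            new_tids = tids & index[item] if prefix else index[item]
            support = len(new_tids) / n
            if support < floor:
                continue  # prune this branch and backtrack
            candidate = prefix + [item]
            if support >= interest_threshold(n, bins, len(candidate), tau):
                found.append(frozenset(candidate))
            dfs(candidate, new_tids, i + 1)

    dfs([], set(), 0)
    # Keep only sets that are not strictly contained in another interesting set.
    return [s for s in found if not any(s < t for t in found)]
```

Intersecting id sets is what makes the vertical layout convenient here: the support of an extended subspace comes from one set intersection instead of a pass over the data. The search is exponential in the worst case, so this sketch is meant for small examples only.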