Machine Learning and Applications, Fourth International Conference on (2011)
Honolulu, Hawaii USA
Dec. 18, 2011 to Dec. 21, 2011
The problem of detecting clusters in high-dimensional data is increasingly common in machine learning applications, for instance in computer vision and bioinformatics. Recently, a number of subspace clustering approaches have been proposed that search for clusters in subspaces of unknown dimension. Learning the number of clusters, the dimension of each subspace, and the correct assignments is a challenging task: many existing algorithms either perform poorly when subspaces have different dimensions and possibly overlap, or are computationally expensive. In this work we present a novel approach to subspace clustering that learns the number of clusters and the dimensionality of each subspace efficiently. We assume that the data points in each cluster are well represented in low dimensions by a PCA model. We propose a measure of the predictive influence of data points modelled by PCA, which we minimise to drive the clustering process. The proposed predictive subspace clustering algorithm is assessed on both simulated data and the popular Yale faces database, where state-of-the-art performance and speed are obtained.
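The idea of fitting one PCA model per cluster and reassigning points according to how well each model represents them can be illustrated with a minimal sketch. This is not the authors' algorithm: it substitutes plain PCA reconstruction error for the predictive influence measure described above, and it takes the number of clusters and the subspace dimension as fixed inputs rather than learning them. The function names (fit_pca, residual, k_subspaces) and all parameters are hypothetical, chosen for illustration only.

```python
import numpy as np

def fit_pca(X, dim):
    """Return the mean and the top `dim` principal directions of X (rows are points)."""
    mean = X.mean(axis=0)
    # Right singular vectors of the centred data are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:dim].T

def residual(X, mean, basis):
    """Squared distance from each row of X to the affine subspace (mean, basis)."""
    centred = X - mean
    projected = centred @ basis @ basis.T
    return np.sum((centred - projected) ** 2, axis=1)

def k_subspaces(X, n_clusters, dim, n_iter=50, seed=0):
    """Alternate between fitting one PCA model per cluster and reassigning points.

    Assumption: reconstruction error stands in for the paper's predictive
    influence measure, and n_clusters and dim are given, not learned.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=len(X))  # random initial assignment
    for _ in range(n_iter):
        errors = np.full((len(X), n_clusters), np.inf)
        for k in range(n_clusters):
            members = X[labels == k]
            # Skip clusters with too few points to define a dim-dimensional model.
            if len(members) > dim:
                mean, basis = fit_pca(members, dim)
                errors[:, k] = residual(X, mean, basis)
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels
```

Calling k_subspaces(X, n_clusters=2, dim=1) on an array X whose rows are data points returns one integer label per row. Like other alternating schemes of this kind, the result depends on the random initial assignment, so several restarts with different seeds may be needed in practice.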
subspace clustering, PCA, predictive clustering
Brian McWilliams, Giovanni Montana, "Predictive Subspace Clustering", Machine Learning and Applications, Fourth International Conference on, vol. 01, pp. 247-252, 2011, doi:10.1109/ICMLA.2011.117