Issue No. 04 - October-December (2009 vol. 6)
Satoshi Niijima, Kyoto University, Kyoto
Yasushi Okuno, Kyoto University, Kyoto
In recent years, numerous feature selection techniques have been proposed and have found wide application in genomics and proteomics. For instance, feature (gene) selection has proven useful for biomarker discovery from microarray and mass spectrometry data. While supervised feature selection has been explored extensively, only a few unsupervised methods are available for exploratory data analysis. In this paper, we address the problem of unsupervised feature selection. First, we extend Laplacian linear discriminant analysis (LLDA) to unsupervised cases. Second, we propose a novel algorithm for computing LLDA that is efficient when the dimensionality is high and the sample size is small, as in microarray data. Finally, we propose an unsupervised feature selection method called LLDA-based Recursive Feature Elimination (LLDA-RFE). We apply LLDA-RFE to several public cancer microarray data sets and compare its performance with that of Laplacian score and SVD-entropy, two state-of-the-art unsupervised methods, and with that of Fisher score, a supervised filter method. Our results demonstrate that LLDA-RFE outperforms Laplacian score and compares favorably with SVD-entropy. On some of the data sets it performs even better than Fisher score, even though LLDA-RFE is fully unsupervised.
Unsupervised feature selection, linear discriminant analysis, graph Laplacian, microarray data analysis.
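To make the comparison concrete, the following is a minimal sketch of the Laplacian score, one of the unsupervised baselines named in the abstract, in its commonly cited form (k-nearest-neighbor graph with heat-kernel weights). It is not the authors' LLDA-RFE, whose details are not given here. The function name `laplacian_score`, the parameters `n_neighbors` and `t`, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def laplacian_score(X, n_neighbors=5, t=1.0):
    """Laplacian score per feature (lower = better locality preservation).

    X is (n_samples, n_features). n_neighbors and t are illustrative
    defaults, not values taken from the paper.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    # k-nearest-neighbor adjacency, excluding self, then symmetrized.
    d2_noself = d2.copy()
    np.fill_diagonal(d2_noself, np.inf)
    nn = np.argsort(d2_noself, axis=1)[:, :n_neighbors]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), n_neighbors), nn.ravel()] = True
    A = A | A.T

    # Heat-kernel edge weights, degree vector, and graph Laplacian L = D - S.
    S = np.where(A, np.exp(-d2 / t), 0.0)
    d = S.sum(axis=1)
    L = np.diag(d) - S

    # Remove the degree-weighted mean from each feature.
    Xc = X - (d @ X) / d.sum()

    # Score_r = (f_r^T L f_r) / (f_r^T D f_r), computed for all features at once.
    num = np.sum(Xc * (L @ Xc), axis=0)
    den = np.sum(Xc * (d[:, None] * Xc), axis=0)
    return num / np.maximum(den, 1e-12)

# Synthetic example: feature 0 separates two groups, features 1-3 are noise.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 4))
X[:, 0] = 10.0 * groups + 0.1 * rng.normal(size=40)
scores = laplacian_score(X)
ranking = np.argsort(scores)  # best (lowest score) feature first
```

A filter method like this ranks each feature once and independently. A recursive elimination scheme such as the one named in the abstract would instead re-rank after each removal, at a higher computational cost.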
S. Niijima and Y. Okuno, "Laplacian Linear Discriminant Analysis Approach to Unsupervised Feature Selection," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 4, pp. 605-614, 2009.