Issue No.01 - January (2008 vol.20)
pp: 1-12
ABSTRACT
Linear Discriminant Analysis (LDA) is a popular method for extracting features that preserve class separability, and it has been widely used in many fields of information processing. However, computing LDA involves the eigendecomposition of dense matrices, which can be expensive in both time and memory. Specifically, LDA has $O(mnt+t^3)$ time complexity and requires $O(mn+mt+nt)$ memory, where $m$ is the number of samples, $n$ is the number of features, and $t=\min(m,n)$. When both $m$ and $n$ are large, applying LDA is infeasible. In this paper, we propose a novel algorithm for discriminant analysis, called Spectral Regression Discriminant Analysis (SRDA). By using spectral graph analysis, SRDA casts discriminant analysis into a regression framework, which facilitates both efficient computation and the use of regularization techniques. Specifically, SRDA only needs to solve a set of regularized least squares problems and involves no eigenvector computation, which yields a substantial saving in both time and memory. Our theoretical analysis shows that SRDA can be computed in $O(ms)$ time with $O(ms)$ memory, where $s$ ($s \leq n$) is the average number of nonzero features per sample. Extensive experimental results on four real-world data sets demonstrate the effectiveness and efficiency of our algorithm.
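To give a feel for the gap between the two complexity bounds quoted in the abstract, the following minimal sketch plugs sample sizes into the leading-order terms. It is only an illustration: the function names and the sizes chosen for m, n, and s are hypothetical, and the counts ignore the constant factors hidden by the big-O notation, so they indicate orders of magnitude rather than measured cost.

```python
def lda_cost(m, n):
    """Leading-order LDA cost as stated in the abstract.

    Returns (time, memory) using time ~ m*n*t + t^3 and
    memory ~ m*n + m*t + n*t, where t = min(m, n).
    Constant factors are ignored (assumption for illustration).
    """
    t = min(m, n)
    time_ops = m * n * t + t ** 3
    memory_cells = m * n + m * t + n * t
    return time_ops, memory_cells


def srda_cost(m, s):
    """Leading-order SRDA cost as stated in the abstract.

    Returns (time, memory), both ~ m*s, where s is the average
    number of nonzero features per sample.
    """
    return m * s, m * s


if __name__ == "__main__":
    # Hypothetical problem size: many samples, many features, sparse rows.
    m, n, s = 100_000, 50_000, 100

    lda_time, lda_mem = lda_cost(m, n)
    srda_time, srda_mem = srda_cost(m, s)

    print(f"LDA : time ~ {lda_time:.2e}, memory ~ {lda_mem:.2e}")
    print(f"SRDA: time ~ {srda_time:.2e}, memory ~ {srda_mem:.2e}")
    print(f"time ratio   ~ {lda_time / srda_time:.1e}")
    print(f"memory ratio ~ {lda_mem / srda_mem:.1e}")
```

Because s can be far smaller than n for sparse data such as text, the SRDA terms grow with the number of nonzero entries rather than with the full m-by-n matrix, which is where the claimed savings come from.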
INDEX TERMS
Data mining, Feature evaluation and selection
CITATION
Deng Cai, Xiaofei He, Jiawei Han, "SRDA: An Efficient Algorithm for Large-Scale Discriminant Analysis", IEEE Transactions on Knowledge & Data Engineering, vol.20, no. 1, pp. 1-12, January 2008, doi:10.1109/TKDE.2007.190669