A Two-Stage Linear Discriminant Analysis via QR-Decomposition
June 2005 (vol. 27 no. 6)
pp. 929-941
Jieping Ye and Qi Li
Linear Discriminant Analysis (LDA) is a well-known method for feature extraction and dimension reduction. It has been used widely in many applications involving high-dimensional data, such as image and text classification. An intrinsic limitation of classical LDA is the so-called singularity problem: it fails when all scatter matrices are singular. Many LDA extensions have been proposed to overcome the singularity problem. Among these extensions, PCA+LDA, a two-stage method, has received relatively more attention. In PCA+LDA, the LDA stage is preceded by an intermediate dimension-reduction stage using Principal Component Analysis (PCA). Most previous LDA extensions are computationally expensive and not scalable, due to their use of the Singular Value Decomposition or the Generalized Singular Value Decomposition. In this paper, we propose a two-stage LDA method, namely LDA/QR, which aims to overcome the singularity problem of classical LDA while achieving efficiency and scalability simultaneously. The key difference between LDA/QR and PCA+LDA lies in the first stage: LDA/QR applies QR decomposition to a small matrix involving the class centroids, whereas PCA+LDA applies PCA to the total scatter matrix involving all training data points. We further justify the proposed algorithm by showing the relationship between LDA/QR and previous LDA methods. Extensive experiments on face images and text documents are presented to show the effectiveness of the proposed algorithm.
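The two-stage procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' reference implementation: stage 1 takes an economy QR decomposition of the small d-by-k centroid matrix (avoiding the singularity problem, since the scatter matrices are never formed in the original d-dimensional space), and stage 2 solves a small k-by-k discriminant problem in the subspace spanned by Q. The function name `lda_qr` and the ridge parameter `reg` are illustrative choices, not from the paper.

```python
import numpy as np

def lda_qr(X, y, reg=1e-6):
    """Sketch of a two-stage LDA via QR decomposition.

    X : (n, d) data matrix, one sample per row.
    y : (n,) class labels.
    Returns G : (d, k) linear transformation, k = number of classes.
    """
    classes = np.unique(y)
    k = len(classes)
    mean = X.mean(axis=0)
    # Centroid matrix: each column is a class mean shifted by the global mean.
    C = np.column_stack([X[y == c].mean(axis=0) - mean for c in classes])
    # Stage 1: economy QR of the small d x k centroid matrix.
    Q, _ = np.linalg.qr(C)            # Q is d x k with orthonormal columns
    # Project the centered data into the k-dimensional subspace.
    Z = (X - mean) @ Q                # n x k
    # Reduced between-class and within-class scatter matrices (k x k).
    Sb = np.zeros((k, k))
    Sw = np.zeros((k, k))
    for c in classes:
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sb += len(Zc) * np.outer(mc, mc)
        Sw += (Zc - mc).T @ (Zc - mc)
    # Stage 2: small eigenproblem; a ridge term guards against singular Sw.
    evals, V = np.linalg.eig(np.linalg.solve(Sw + reg * np.eye(k), Sb))
    order = np.argsort(-evals.real)
    return Q @ V[:, order].real       # final d x k transformation
```

Because both stages operate on k-by-k or d-by-k matrices (k = number of classes), the cost scales with the number of classes rather than with the full covariance structure of the data, which is the source of the efficiency claimed in the abstract.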

[1] G. Baudat and F. Anouar, “Generalized Discriminant Analysis Using a Kernel Approach,” Neural Computation, vol. 12, no. 10, pp. 2385-2404, 2000.
[2] P.N. Belhumeur, J.P. Hespanha, and D.J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, July 1997.
[3] H. Cevikalp, M. Neamtu, M. Wilkes, and A. Barkana, “Discriminative Common Vectors for Face Recognition,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 1, pp. 4-13, Jan. 2005.
[4] S. Chakrabarti, S. Roy, and M. Soundalgekar, “Fast and Accurate Text Classification via Multiple Linear Discriminant Projections,” Very Large Databases J., vol. 12, no. 2, pp. 170-185, 2003.
[5] L.F. Chen, H.Y.M. Liao, J.C. Lin, M.D. Kao, and G.J. Yu, “A New LDA-Based Face Recognition System which Can Solve the Small Sample Size Problem,” Pattern Recognition, vol. 33, no. 10, pp. 1713-1726, 2000.
[6] R.O. Duda, P.E. Hart, and D. Stork, Pattern Classification. Wiley, 2000.
[7] S. Dudoit, J. Fridlyand, and T.P. Speed, “Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data,” J. Am. Statistical Assoc., vol. 97, no. 457, pp. 77-87, 2002.
[8] J.H. Friedman, “Regularized Discriminant Analysis,” J. Am. Statistical Assoc., vol. 84, no. 405, pp. 165-175, 1989.
[9] K. Fukunaga, Introduction to Statistical Pattern Recognition. San Diego, Calif.: Academic Press, 1990.
[10] G.H. Golub and C.F. Van Loan, Matrix Computations, third ed. The Johns Hopkins Univ. Press, 1996.
[11] P. Howland, M. Jeon, and H. Park, “Structure Preserving Dimension Reduction for Clustered Text Data Based on the Generalized Singular Value Decomposition,” SIAM J. Matrix Analysis and Applications, vol. 25, no. 1, pp. 165-179, 2003.
[12] W.-S. Hwang and J. Weng, “Hierarchical Discriminant Regression,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1277-1293, Nov. 2000.
[13] H. Jin, B.C. Ooi, H.T. Shen, C. Yu, and A.Y. Zhou, “An Adaptive and Efficient Dimensionality Reduction Algorithm for High-Dimensionality Indexing,” Proc. Int'l Conf. Data Eng., pp. 87-98, 2003.
[14] I.T. Jolliffe, Principal Component Analysis. Springer-Verlag, 1986.
[15] K.V.R. Kanth, D. Agrawal, A.E. Abbadi, and A. Singh, “Dimensionality Reduction for Similarity Searching in Dynamic Databases,” Computer Vision and Image Understanding: CVIU, vol. 75, nos. 1-2, pp. 59-72, 1999.
[16] W.J. Krzanowski, P. Jonathan, W.V. McCarthy, and M.R. Thomas, “Discriminant Analysis with Singular Covariance Matrices: Methods and Applications to Spectroscopic Data,” Applied Statistics, vol. 44, pp. 101-115, 1995.
[17] S. Kumar, J. Ghosh, and M.M. Crawford, “Hierarchical Fusion of Multiple Classifiers for Hyperspectral Data Analysis,” Pattern Analysis and Applications, vol. 5, no. 2, pp. 210-220, 2002.
[18] D.D. Lewis, “Reuters-21578 Text Categorization Test Collection Distribution 1.0,” 1999.
[19] C. Liu and H. Wechsler, “Enhanced Fisher Linear Discriminant Models for Face Recognition,” Proc. Int'l Conf. Pattern Recognition, pp. 1368-1372, 1998.
[20] J. Lu, K.N. Plataniotis, and A.N. Venetsanopoulos, “Face Recognition Using Kernel Direct Discriminant Analysis Algorithms,” IEEE Trans. Neural Networks, vol. 14, no. 1, pp. 117-126, 2003.
[21] A. Martinez and A. Kak, “PCA versus LDA,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, pp. 228-233, 2001.
[22] A.M. Martinez and R. Benavente, “The AR Face Database,” Technical Report No. 24, CVC, 1998.
[23] M.F. Porter, “An Algorithm for Suffix Stripping,” Program, vol. 14, no. 3, pp. 130-137, 1980.
[24] S. Raudys and R.P.W. Duin, “On Expected Classification Error of the Fisher Linear Classifier with Pseudoinverse Covariance Matrix,” Pattern Recognition Letters, vol. 19, nos. 5-6, pp. 385-392, 1998.
[25] G. Salton, Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, 1989.
[26] B. Schölkopf and A. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, 2002.
[27] D.L. Swets and J. Weng, “Using Discriminant Eigenfeatures for Image Retrieval,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 831-836, Aug. 1996.
[28] F. De la Torre and M. Black, “Robust Principal Component Analysis for Computer Vision,” Proc. Int'l Conf. Computer Vision, vol. I, pp. 362-369, 2001.
[29] TREC, Proc. Text Retrieval Conf., 1999.
[30] M. Turk and A. Pentland, “Eigenfaces for Recognition,” J. Cognitive Neuroscience, vol. 3, pp. 71-86, 1991.
[31] C.F. Van Loan, “Generalizing the Singular Value Decomposition,” SIAM J. Numerical Analysis, vol. 13, pp. 76-83, 1976.
[32] J. Ye, R. Janardan, C.H. Park, and H. Park, “An Optimization Criterion for Generalized Discriminant Analysis on Undersampled Problems,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, no. 8, pp. 982-994, Aug. 2004.
[33] H. Yu and J. Yang, “A Direct LDA Algorithm for High-Dimensional Data with Applications to Face Recognition,” Pattern Recognition, vol. 34, pp. 2067-2070, 2001.
[34] Y. Zhao and G. Karypis, “Empirical and Theoretical Comparisons of Selected Criterion Functions for Document Clustering,” Machine Learning, vol. 55, no. 3, pp. 311-331, 2004.

Index Terms:
Linear discriminant analysis, dimension reduction, QR decomposition, classification.
Jieping Ye, Qi Li, "A Two-Stage Linear Discriminant Analysis via QR-Decomposition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 6, pp. 929-941, June 2005, doi:10.1109/TPAMI.2005.110