Lanczos Vectors versus Singular Vectors for Effective Dimension Reduction
August 2009 (vol. 21 no. 8)
pp. 1091-1103
Jie Chen, University of Minnesota, Minneapolis
Yousef Saad, University of Minnesota, Minneapolis
This paper takes an in-depth look at a technique for computing filtered matrix-vector (mat-vec) products, which are required in many data analysis applications. In these applications, the data matrix is multiplied by a vector, and we wish to perform this product accurately in the space spanned by a few of the major singular vectors of the matrix. We examine the use of the Lanczos algorithm for this purpose. The goal of the method is the same as that of the truncated singular value decomposition (SVD), namely, to preserve the quality of the resulting mat-vec product in the major singular directions of the matrix. The Lanczos-based approach achieves this goal by using a small number of Lanczos vectors, but it does not explicitly compute singular values or singular vectors of the matrix. The main advantage of the Lanczos-based technique is its low cost compared with that of the truncated SVD, and this advantage comes without sacrificing accuracy. The effectiveness of the approach is demonstrated on a few sample applications requiring dimension reduction, including information retrieval and face recognition. The proposed technique can be applied as a replacement for the truncated SVD technique whenever the problem can be formulated as a filtered mat-vec multiplication.
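
The idea of a filtered mat-vec can be illustrated with a minimal NumPy sketch. The abstract does not give formulas, so everything here is an assumption made for illustration: the product is approximated as A Q Q^T b, where the columns of Q are k Lanczos vectors of A^T A, and it is compared with the truncated-SVD product A V_k V_k^T b. The function names (lanczos_vectors, lanczos_filtered_matvec, svd_filtered_matvec), the full reorthogonalization, and the synthetic test matrix are hypothetical choices, not the authors' implementation.

```python
import numpy as np


def lanczos_vectors(apply_op, n, k, rng):
    """Return up to k orthonormal Lanczos vectors for a symmetric operator.

    Full reorthogonalization is used for simplicity (an assumption of this
    sketch; cheaper partial schemes exist).
    """
    Q = np.zeros((n, k))
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    for j in range(k):
        Q[:, j] = q
        if j == k - 1:
            break
        w = apply_op(q)
        # Orthogonalize against all vectors so far; two passes for stability.
        for _ in range(2):
            w -= Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)
        beta = np.linalg.norm(w)
        if beta < 1e-12:  # invariant subspace found: stop early
            return Q[:, : j + 1]
        q = w / beta
    return Q


def lanczos_filtered_matvec(A, b, k, seed=0):
    """Approximate A @ b as A Q Q^T b, with Q from Lanczos on A^T A."""
    rng = np.random.default_rng(seed)
    Q = lanczos_vectors(lambda v: A.T @ (A @ v), A.shape[1], k, rng)
    return A @ (Q @ (Q.T @ b))


def svd_filtered_matvec(A, b, k):
    """Reference: project b onto the k leading right singular vectors."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k].T
    return A @ (Vk @ (Vk.T @ b))


# Synthetic matrix with known, rapidly decaying singular values.
rng = np.random.default_rng(0)
m, n, k = 50, 30, 10
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = 0.5 ** np.arange(n)
A = (U * sigma) @ V.T
b = rng.standard_normal(n)

exact = A @ b
approx_lanczos = lanczos_filtered_matvec(A, b, k)
approx_svd = svd_filtered_matvec(A, b, k)

# Error of each approximation along the dominant left singular vector.
u1 = U[:, 0]
gap_lanczos = abs(u1 @ approx_lanczos - u1 @ exact)
gap_svd = abs(u1 @ approx_svd - u1 @ exact)
```

In this sketch the Lanczos route needs only k products with A and A^T, whereas the reference routine computes a full SVD; both should reproduce the component of A b along the dominant singular direction closely for a matrix with well-separated leading singular values.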

Index Terms:
Dimension reduction, SVD, Lanczos algorithm, information retrieval, latent semantic indexing, face recognition, PCA, eigenfaces.
Jie Chen, Yousef Saad, "Lanczos Vectors versus Singular Vectors for Effective Dimension Reduction," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 8, pp. 1091-1103, Aug. 2009, doi:10.1109/TKDE.2008.228