Issue No.12 - Dec. (2013 vol.35)
pp: 3050-3065
Delin Chu, Dept. of Math., Nat. Univ. of Singapore, Singapore
Li-Zhi Liao, Dept. of Math., Hong Kong Baptist Univ., Kowloon, China
Michael K. Ng, Dept. of Math., Hong Kong Baptist Univ., Kowloon, China
Xiaowei Zhang, Dept. of Math., Nat. Univ. of Singapore, Singapore
ABSTRACT
In this paper, we study canonical correlation analysis (CCA), which is a powerful tool in multivariate data analysis for finding the correlation between two sets of multidimensional variables. The main contributions of the paper are: 1) to reveal the equivalent relationship between a recursive formula and a trace formula for the multiple CCA problem, 2) to obtain the explicit characterization for all solutions of the multiple CCA problem even when the corresponding covariance matrices are singular, 3) to develop a new sparse CCA algorithm, and 4) to establish the equivalent relationship between the uncorrelated linear discriminant analysis and the CCA problem. We test several simulated and real-world datasets in gene classification and cross-language document retrieval to demonstrate the effectiveness of the proposed algorithm. The performance of the proposed method is competitive with the state-of-the-art sparse CCA algorithms.
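For readers new to CCA, the following is a minimal sketch of the classical (dense, non-sparse) problem the abstract refers to: finding projections of two variable sets that are maximally correlated. It is not the sparse algorithm proposed in the paper. The function name `classical_cca`, the parameter names, and the small ridge term `reg` (added so that the Cholesky factorization succeeds when a covariance matrix is singular) are assumptions made for illustration; the paper characterizes the singular case analytically rather than by regularizing.

```python
import numpy as np


def classical_cca(X, Y, k, reg=1e-8):
    """Classical CCA via whitening and an SVD (illustrative sketch).

    X: (n, p) array, Y: (n, q) array; rows are paired samples.
    Returns Wx (p, k), Wy (q, k) and the top-k canonical correlations.
    `reg` is a hypothetical ridge term, not part of the paper's method.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # center each variable
    Yc = Y - Y.mean(axis=0)

    # Sample covariance and cross-covariance matrices.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    # Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)

    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T

    # Singular values of M are the canonical correlations.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to the original coordinates.
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return Wx, Wy, s[:k]
```

The returned weights satisfy `Wx.T @ Cxx @ Wx = I` and `Wy.T @ Cyy @ Wy = I` up to the ridge term, which is the usual normalization constraint of the multiple CCA problem. In general these weight vectors are dense, which is what motivates a sparse formulation.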
INDEX TERMS
Sparse matrices, Orthogonality, Canonical correlation analysis, Data models, Linear discriminant analysis, Sparsity, Multivariate data
CITATION
Delin Chu, Li-Zhi Liao, Michael K. Ng, Xiaowei Zhang, "Sparse Canonical Correlation Analysis: New Formulation and Algorithm", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 12, pp. 3050-3065, Dec. 2013, doi:10.1109/TPAMI.2013.104