Issue No.11 - November (2009 vol.21)
pp: 1590-1603
Yi-Ren Yeh, National Taiwan University of Science and Technology, Taipei
Su-Yun Huang, Academia Sinica, Taipei
Yuh-Jye Lee, National Taiwan University of Science and Technology, Taipei
Sliced inverse regression (SIR) is a renowned dimension reduction method for finding an effective low-dimensional linear subspace. Like many other linear methods, SIR can be extended to a nonlinear setting via the “kernel trick.” The purpose of this paper is twofold. First, we build kernel SIR rigorously in a reproducing kernel Hilbert space, which allows a more intuitive model explanation and theoretical development. Second, we focus on the implementation algorithm of kernel SIR for fast computation and numerical stability: we adopt a low-rank approximation of the huge and dense full kernel covariance matrix and a reduced singular value decomposition technique for extracting kernel SIR directions. We also explore kernel SIR's ability to combine with other linear learning algorithms for classification and regression, including multiresponse regression. Numerical experiments show that kernel SIR is an effective kernel tool for nonlinear dimension reduction and that it combines easily with other linear algorithms to form a powerful toolkit for nonlinear data analysis.
Dimension reduction, eigenvalue decomposition, kernel, reproducing kernel Hilbert space, singular value decomposition, sliced inverse regression, support vector machines.
Yi-Ren Yeh, Su-Yun Huang, Yuh-Jye Lee, "Nonlinear Dimension Reduction with Kernel Sliced Inverse Regression", IEEE Transactions on Knowledge & Data Engineering, vol.21, no. 11, pp. 1590-1603, November 2009, doi:10.1109/TKDE.2008.232
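
The pipeline described in the abstract (kernelize, approximate the kernel covariance at low rank, slice the response, extract directions with a small SVD) can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: it assumes a Gaussian kernel, uses a random subset of the data as the low-rank basis, and adds a small ridge term for numerical stability. The names `gaussian_kernel`, `kernel_sir`, `n_basis`, `gamma`, and `ridge` are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Gaussian kernel matrix exp(-gamma * ||a - b||^2) between rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kernel_sir(X, y, n_slices=10, n_dirs=2, n_basis=50, gamma=1.0, ridge=1e-6, seed=0):
    """Illustrative kernel SIR; returns (transform function, leading eigenvalues)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Low-rank step (assumption): an n x m kernel matrix built on a random
    # subset of m points instead of the full n x n kernel matrix.
    basis = X[rng.choice(n, size=min(n_basis, n), replace=False)]
    K = gaussian_kernel(X, basis, gamma)
    col_mean = K.mean(axis=0)
    Kc = K - col_mean

    # Covariance of the kernel features; the ridge keeps it well conditioned.
    cov = Kc.T @ Kc / n + ridge * np.eye(Kc.shape[1])
    s, U = np.linalg.eigh(cov)
    W = U / np.sqrt(s)  # whitening matrix, so that W.T @ cov @ W = I

    # Slice the sorted response and stack the weighted slice means.
    slices = np.array_split(np.argsort(y), n_slices)
    M = np.vstack([np.sqrt(len(idx) / n) * Kc[idx].mean(axis=0) for idx in slices])

    # Thin SVD of the small (n_slices x m) whitened slice-mean matrix;
    # squared singular values are the generalized eigenvalues.
    _, sing, Vt = np.linalg.svd(M @ W, full_matrices=False)
    directions = W @ Vt[:n_dirs].T

    def transform(X_new):
        """Project new points onto the extracted kernel SIR directions."""
        return (gaussian_kernel(X_new, basis, gamma) - col_mean) @ directions

    return transform, sing[:n_dirs] ** 2
```

Calling `transform, evals = kernel_sir(X, y)` and then `Z = transform(X)` yields low-dimensional features `Z` that can be passed to any linear method, such as least squares regression or a linear classifier, which is the kind of combination the abstract describes.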