Kernel Entropy Component Analysis
May 2010 (vol. 32 no. 5)
pp. 847-860
Robert Jenssen, University of Tromsø, Tromsø
We introduce kernel entropy component analysis (kernel ECA) as a new method for data transformation and dimensionality reduction. Kernel ECA reveals structure relating to the Renyi entropy of the input space data set, estimated via a kernel matrix using Parzen windowing. This is achieved by projections onto a subset of entropy-preserving kernel principal component analysis (kernel PCA) axes. In general, this subset does not correspond to the top eigenvalues of the kernel matrix, in contrast to dimensionality reduction using kernel PCA. We show that kernel ECA may produce strikingly different transformed data sets compared to kernel PCA, with a distinct angle-based structure. A new spectral clustering algorithm utilizing this structure is developed with positive results. Furthermore, kernel ECA is shown to be a useful alternative for pattern denoising.
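The abstract's procedure can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian Parzen kernel and uses the fact that the Renyi entropy estimate decomposes as V = (1/N^2) sum_i lambda_i (1^T e_i)^2 over the kernel matrix eigenpairs, so axes are ranked by lambda_i (1^T e_i)^2 rather than by lambda_i alone. The function name `kernel_eca` and the bandwidth parameter `sigma` are illustrative choices.

```python
import numpy as np

def kernel_eca(X, n_components=2, sigma=1.0):
    """Sketch of kernel ECA: project onto the kernel PCA axes that
    contribute most to the Renyi entropy estimate V = (1/N^2) 1^T K 1."""
    # Gaussian (Parzen window) kernel matrix
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-d2 / (2.0 * sigma**2))
    # Eigendecomposition of the symmetric kernel matrix
    lam, E = np.linalg.eigh(K)
    # Entropy contribution of each axis: lambda_i * (1^T e_i)^2;
    # the selected subset need not match the top eigenvalues
    contrib = lam * (E.sum(axis=0) ** 2)
    idx = np.argsort(contrib)[::-1][:n_components]
    # Projection onto selected axes: sqrt(lambda_i) * e_i
    return E[:, idx] * np.sqrt(np.maximum(lam[idx], 0.0))

X = np.random.RandomState(0).randn(20, 3)
Z = kernel_eca(X, n_components=2)
print(Z.shape)  # (20, 2)
```

Note the contrast with kernel PCA, which would instead select `idx` by sorting `lam` alone.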

[1] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification. John Wiley & Sons, 2001.
[2] S. Theodoridis and K. Koutroumbas, Pattern Recognition. Academic Press, 1999.
[3] I.T. Jolliffe, Principal Component Analysis. Springer Verlag, 1986.
[4] H. Hotelling, "Analysis of a Complex of Statistical Variables into Principal Components," J. Educational Psychology, vol. 24, pp. 417-441, 1933.
[5] B. Schölkopf, A.J. Smola, and K.-R. Müller, "Nonlinear Component Analysis as a Kernel Eigenvalue Problem," Neural Computation, vol. 10, pp. 1299-1319, 1998.
[6] H. Zha, X. He, C. Ding, H. Simon, and M. Gu, "Spectral Relaxation for K-means Clustering," Advances in Neural Information Processing Systems 14, pp. 1057-1064, MIT Press, 2002.
[7] J. MacQueen, "Some Methods for Classification and Analysis of Multivariate Observations," Proc. Berkeley Symp. Math. Statistics and Probability, pp. 281-297, 1967.
[8] J.T. Kwok and I.W. Tsang, "The Pre-Image Problem in Kernel Methods," IEEE Trans. Neural Networks, vol. 15, no. 6, pp. 1517-1525, 2004.
[9] S. Mika, B. Schölkopf, A. Smola, K.-R. Müller, M. Scholz, and G. Rätsch, "Kernel PCA and Denoising in Feature Space," Advances in Neural Information Processing Systems 11, pp. 536-542, MIT Press, 1999.
[10] B. Schölkopf, S. Mika, C.J.C. Burges, P. Knirsch, K.-R. Müller, G. Rätsch, and A.J. Smola, "Input Space versus Feature Space in Kernel-Based Methods," IEEE Trans. Neural Networks, vol. 10, no. 5, pp. 1000-1017, 1999.
[11] M.L. Braun, J.M. Buhmann, and K.-R. Müller, "On Relevant Dimensions in Kernel Feature Spaces," J. Machine Learning Research, vol. 9, pp. 1875-1908, 2008.
[12] A.Y. Ng, M. Jordan, and Y. Weiss, "On Spectral Clustering: Analysis and an Algorithm," Advances in Neural Information Processing Systems 14, pp. 849-856, MIT Press, 2002.
[13] M. Belkin and P. Niyogi, "Laplacian Eigenmaps for Dimensionality Reduction and Data Representation," Neural Computation, vol. 15, pp. 1373-1396, 2003.
[14] J. Shi and J. Malik, "Normalized Cuts and Image Segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, Aug. 2000.
[15] S. Roweis and L. Saul, "Nonlinear Dimensionality Reduction by Locally Linear Embedding," Science, vol. 290, pp. 2323-2326, 2000.
[16] J. Tenenbaum, V. de Silva, and J.C. Langford, "A Global Geometric Framework for Nonlinear Dimensionality Reduction," Science, vol. 290, pp. 2319-2323, 2000.
[17] K.Q. Weinberger and L.K. Saul, "Unsupervised Learning of Image Manifolds by Semidefinite Programming," Int'l J. Computer Vision, vol. 70, no. 1, pp. 77-90, 2006.
[18] L.K. Saul, K.Q. Weinberger, J.H. Ham, F. Sha, and D.D. Lee, "Spectral Methods for Dimensionality Reduction," Semisupervised Learning, O. Chapelle, B. Schölkopf, and A. Zien, eds., chapter 1, MIT Press, 2005.
[19] C.J.C. Burges, "Geometric Methods for Feature Extraction and Dimensional Reduction," Data Mining and Knowledge Discovery Handbook: A Complete Guide for Researchers and Practitioners, O. Maimon and L. Rokach, eds., chapter 4, Kluwer Academic Publishers, 2005.
[20] R. Jenssen, T. Eltoft, M. Girolami, and D. Erdogmus, "Kernel Maximum Entropy Data Transformation and an Enhanced Spectral Clustering Algorithm," Advances in Neural Information Processing Systems 19, pp. 633-640, MIT Press, 2007.
[21] R. Jenssen and O.K. Storås, "Kernel ECA Pre-Images for Pattern Denoising," Proc. Scandinavian Conf. Image Analysis, June 2009.
[22] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis. Cambridge Univ. Press, 2004.
[23] J. Mercer, "Functions of Positive and Negative Type and Their Connection with the Theory of Integral Equations," Philosophical Trans. Royal Soc. London, vol. A, pp. 415-446, 1909.
[24] K.R. Müller, S. Mika, G. Rätsch, K. Tsuda, and B. Schölkopf, "An Introduction to Kernel-Based Learning Algorithms," IEEE Trans. Neural Networks, vol. 12, no. 2, pp. 181-201, Mar. 2001.
[25] C.K.I. Williams, "On a Connection between Kernel PCA and Metric Multidimensional Scaling," Machine Learning, vol. 46, pp. 11-19, 2002.
[26] A. Renyi, "On Measures of Entropy and Information," Selected Papers of Alfred Renyi, vol. 2, pp. 565-580, Akademiai Kiado, 1976.
[27] E. Parzen, "On the Estimation of a Probability Density Function and the Mode," The Annals of Math. Statistics, vol. 32, pp. 1065-1076, 1962.
[28] B.W. Silverman, Density Estimation for Statistics and Data Analysis. Chapman and Hall, 1986.
[29] M. Girolami, "Orthogonal Series Density Estimation and the Kernel Eigenvalue Problem," Neural Computation, vol. 14, no. 3, pp. 669-688, 2002.
[30] R. Murphy and D. Ada, "UCI Repository of Machine Learning Databases," technical report, Dept. of Computer Science, Univ. of California, Irvine, 1994.
[31] R. Jenssen and T. Eltoft, "A New Information Theoretic Analysis of Sum-of-Squared-Error Kernel Clustering," Neurocomputing, vol. 72, nos. 1-3, pp. 23-31, 2008.

Index Terms:
Spectral data transformation, Renyi entropy, Parzen windowing, kernel PCA, clustering, pattern denoising.
Robert Jenssen, "Kernel Entropy Component Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 5, pp. 847-860, May 2010, doi:10.1109/TPAMI.2009.100