Principal Surfaces from Unsupervised Kernel Regression
September 2005 (vol. 27 no. 9)
pp. 1379-1391
We propose a nonparametric approach to learning principal surfaces, based on an unsupervised formulation of the Nadaraya-Watson kernel regression estimator. Compared with previous approaches to principal curves and surfaces, the new method offers several advantages. First, it provides a practical solution to the model selection problem, because all parameters can be estimated by leave-one-out cross-validation at no additional computational cost. Second, it allows nonlinear spectral methods to be conveniently incorporated for parameter initialization, beyond classical initializations based on linear PCA. Third, it offers a simple way to fit principal surfaces in general feature spaces, beyond the usual data-space setup. Experimental results on simulated and real data illustrate these features.
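The two ideas at the heart of the abstract can be sketched in a few lines of NumPy: the Nadaraya-Watson estimator predicts outputs as kernel-weighted averages of training targets, and its unsupervised variant (UKR) reconstructs each data point from kernel weights over latent coordinates, with leave-one-out cross-validation obtained essentially for free by zeroing the diagonal of the kernel matrix. This is a minimal illustrative sketch, not the authors' implementation; the function names, the Gaussian kernel choice, and the single bandwidth parameter `h` are assumptions for the example.

```python
import numpy as np

def nadaraya_watson(X_train, Y_train, X_query, h=1.0):
    """Classical (supervised) Nadaraya-Watson kernel regression.

    Predicts y at each query point as a Gaussian-kernel-weighted
    average of the training targets Y_train.
    """
    # Squared distances between query and training inputs.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    W = np.exp(-0.5 * d2 / h**2)       # Gaussian kernel weights
    W /= W.sum(axis=1, keepdims=True)  # normalize rows to sum to 1
    return W @ Y_train

def ukr_loo_error(X_latent, Y, h=1.0):
    """Leave-one-out reconstruction error of a UKR-style manifold.

    X_latent plays the role of the (learned) latent coordinates; each
    observed point Y[i] is reconstructed from all *other* points'
    kernel weights. Zeroing the kernel diagonal implements
    leave-one-out at no extra cost and blocks the trivial solution
    in which every point reconstructs only itself.
    """
    d2 = ((X_latent[:, None, :] - X_latent[None, :, :]) ** 2).sum(-1)
    K = np.exp(-0.5 * d2 / h**2)
    np.fill_diagonal(K, 0.0)           # leave-one-out: drop self-weight
    B = K / K.sum(axis=1, keepdims=True)
    Y_hat = B @ Y                      # LOO reconstructions
    return np.mean(np.sum((Y - Y_hat) ** 2, axis=1))
```

In the actual method, the latent coordinates `X_latent` would be optimized (e.g., by gradient descent after a spectral initialization) to minimize this leave-one-out error; the sketch above only shows how the objective is evaluated.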

[1] P. Meinicke, “Unsupervised Learning in a Generalized Regression Framework,” PhD dissertation, Universität Bielefeld, 2000.
[2] T. Hastie, “Principal Curves and Surfaces,” PhD dissertation, Stanford Univ., 1984.
[3] B. Kégl, A. Krzyzak, T. Linder, and K. Zeger, “Learning and Design of Principal Curves,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 3, pp. 281-297, Mar. 2000.
[4] M. LeBlanc and R. Tibshirani, “Adaptive Principal Surfaces,” J. Am. Statistical Assoc., vol. 89, pp. 53-64, 1994.
[5] A.J. Smola, S. Mika, B. Schölkopf, and R.C. Williamson, “Regularized Principal Manifolds,” J. Machine Learning Research, vol. 1, pp. 179-209, 2001.
[6] K. Chang and J. Ghosh, “A Unified Model for Probabilistic Principal Surfaces,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 22-41, Jan. 2001.
[7] C.M. Bishop, M. Svensen, and C.K.I. Williams, “GTM: The Generative Topographic Mapping,” Neural Computation, vol. 10, no. 1, pp. 215-234, 1998.
[8] T. Kohonen, Self-Organizing Maps. Springer, 1995.
[9] H. Ritter, T. Martinetz, and K. Schulten, Neural Computation and Self-Organizing Maps. Addison Wesley, 1992.
[10] J. Walter and H. Ritter, “Rapid Learning with Parametrized Self-Organizing Maps,” Neurocomputing, vol. 12, pp. 131-153, 1996.
[11] C.M. Bishop, M. Svensén, and C.K.I. Williams, “Developments of the Generative Topographic Mapping,” Neurocomputing, vol. 21, pp. 203-224, 1998.
[12] B. Schölkopf and A.J. Smola, Learning with Kernels. MIT Press, 2002.
[13] E.A. Nadaraya, “On Estimating Regression,” Theory of Probability and Its Applications, vol. 10, pp. 186-190, 1964.
[14] G. Watson, “Smooth Regression Analysis,” Sankhya Series A, vol. 26, pp. 359-372, 1964.
[15] D.W. Scott, Multivariate Density Estimation. Wiley, 1992.
[16] T. Hastie, R. Tibshirani, and J.H. Friedman, The Elements of Statistical Learning. Springer-Verlag, 2001.
[17] S. Sandilya and S.R. Kulkarni, “Principal Curves with Bounded Turn,” IEEE Trans. Information Theory, vol. 48, pp. 2789-2793, 2002.
[18] S. Roweis and L. Saul, “Nonlinear Dimensionality Reduction by Locally Linear Embedding,” Science, vol. 290, pp. 2323-2326, 2000.
[19] J.B. Tenenbaum, V. de Silva, and J.C. Langford, “A Global Geometric Framework for Nonlinear Dimensionality Reduction,” Science, vol. 290, pp. 2319-2323, 2000.
[20] M. Belkin and P. Niyogi, “Laplacian Eigenmaps for Dimensionality Reduction and Data Representation,” Neural Computation, vol. 15, no. 6, pp. 1373-1396, June 2003.
[21] R. Horst and P.M. Pardalos, eds., Handbook of Global Optimization. Dordrecht: Kluwer Academic Publishers, 1995.
[22] G.H. Bakir, J. Weston, and B. Schölkopf, “Learning to Find Pre-Images,” Advances in Neural Information Processing Systems, 2003.
[23] J.T. Kwok and I.W. Tsang, “Finding the Pre-Images in Kernel Principal Component Analysis,” Proc. Sixth Ann. Workshop Kernel Machines, 2002.
[24] J. Weston, O. Chapelle, A. Elisseeff, B. Schölkopf, and V. Vapnik, “Kernel Dependency Estimation,” Advances in Neural Information Processing Systems 15, 2003.
[25] B. Silverman, Density Estimation for Statistics and Data Analysis. London-New York: Chapman and Hall, 1986.
[26] J.C. Lagarias, J.A. Reeds, M.H. Wright, and P.E. Wright, “Convergence Properties of the Nelder-Mead Simplex Algorithm in Low Dimensions,” SIAM J. Optimization, vol. 9, pp. 112-147, 1998.
[27] M. Riedmiller and H. Braun, “A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm,” Proc. IEEE Int'l Conf. Neural Networks, pp. 586-591, 1993.
[28] J.W. Sammon, Jr., “A Non-Linear Mapping for Data Structure Analysis,” IEEE Trans. Computers, vol. 18, pp. 401-409, 1969.
[29] B. Schölkopf, A.J. Smola, and K.-R. Müller, “Nonlinear Component Analysis as a Kernel Eigenvalue Problem,” Neural Computation, vol. 10, pp. 1299-1319, 1998.
[30] Y. Bengio, O. Delalleau, N. Le Roux, J.-F. Paiement, P. Vincent, and M. Ouimet, “Learning Eigenfunctions Links Spectral Embedding and Kernel PCA,” Neural Computation, vol. 16, pp. 2197-2219, 2004.

Index Terms:
Dimensionality reduction, principal curves, principal surfaces, density estimation, model selection, kernel methods.
Citation:
Peter Meinicke, Stefan Klanke, Roland Memisevic, Helge Ritter, "Principal Surfaces from Unsupervised Kernel Regression," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 9, pp. 1379-1391, Sept. 2005, doi:10.1109/TPAMI.2005.183