Constructing Descriptive and Discriminative Nonlinear Features: Rayleigh Coefficients in Kernel Feature Spaces
May 2003 (vol. 25 no. 5)
pp. 623-633

Abstract—We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinearized variant of the Rayleigh coefficient, we propose nonlinear generalizations of Fisher's discriminant and oriented PCA using support vector kernel functions. Extensive simulations show the utility of our approach.
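
For orientation, the Rayleigh coefficient named in the abstract can be sketched in standard notation. The symbols below ($S_I$, $S_N$, $\Phi$, $k$, $\alpha$, $M$, $N$, $\mu_c$, $\ell_c$) are assumptions chosen for illustration and do not appear in the abstract. The two-class Fisher case is shown as one common instance, not necessarily the paper's exact formulation.

```latex
% Rayleigh coefficient: S_I captures the desired information and
% S_N the unwanted variation; both are symmetric matrices.
J(w) = \frac{w^\top S_I\, w}{w^\top S_N\, w}

% Two-class Fisher discriminant as a special case:
% between-class scatter over within-class scatter.
S_I = (m_1 - m_2)(m_1 - m_2)^\top, \qquad
S_N = \sum_{c=1,2} \; \sum_{x \in \mathcal{X}_c} (x - m_c)(x - m_c)^\top

% Kernelized form: expand w in the mapped training points and
% replace inner products by a kernel k(x, x') = <\Phi(x), \Phi(x')>.
w = \sum_{i=1}^{\ell} \alpha_i \, \Phi(x_i)
\quad\Longrightarrow\quad
J(\alpha) = \frac{\alpha^\top M \alpha}{\alpha^\top N \alpha}

% with K_{ij} = k(x_i, x_j), \ell_c the number of samples in class c, and
(\mu_c)_j = \frac{1}{\ell_c} \sum_{x \in \mathcal{X}_c} k(x_j, x), \qquad
M = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^\top, \qquad
N = K K^\top - \sum_{c=1,2} \ell_c \, \mu_c \mu_c^\top
```

Maximizing $J(\alpha)$ leads to a generalized eigenproblem $M\alpha = \lambda N\alpha$. In practice $N$ is usually regularized, for example by adding a small multiple of the identity, because it is estimated from as many samples as it has dimensions.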

Index Terms:
Fisher's discriminant, nonlinear feature extraction, support vector machine, kernel functions, Rayleigh coefficient, oriented PCA.
Citation:
Sebastian Mika, Gunnar Rätsch, Jason Weston, Bernhard Schölkopf, Alex Smola, Klaus-Robert Müller, "Constructing Descriptive and Discriminative Nonlinear Features: Rayleigh Coefficients in Kernel Feature Spaces," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 623-633, May 2003, doi:10.1109/TPAMI.2003.1195996