Issue No. 09 - September (2008, vol. 30)
pp. 1534-1546
Classification can often benefit from efficient feature selection. However, linearly nonseparable data, the need for quick response, small sample sizes, and noisy features make feature selection challenging. In this work, a class separability criterion is developed in a high-dimensional kernel space, and feature selection is performed by maximizing this criterion. To make this approach work, the issues of automatic kernel parameter tuning, numerical stability, and regularization for multi-parameter optimization are addressed. Theoretical analysis uncovers the relationship of this criterion to the radius-margin bound of SVMs, KFDA, and the kernel alignment criterion, providing more insight into using this criterion for feature selection. The criterion is applied to a variety of selection modes with different search strategies. An extensive experimental study demonstrates its efficiency in delivering fast and robust feature selection.
Feature evaluation and selection, Pattern analysis
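To make the idea of a class separability criterion in kernel space concrete, the following is a minimal sketch, not the paper's exact formulation. It assumes a scatter-trace ratio tr(S_B)/tr(S_W) evaluated through kernel sums, a Gaussian kernel, and a binary feature mask; the function names, the `gamma` parameter, and the toy data are all hypothetical choices made for illustration.

```python
import math

def rbf_kernel(x, y, mask, gamma=1.0):
    # Gaussian kernel restricted to the features switched on in `mask`
    # (mask entries are 0 or 1; an assumption made for this sketch).
    d2 = sum(m * (a - b) ** 2 for a, b, m in zip(x, y, mask))
    return math.exp(-gamma * d2)

def kernel_separability(X, y, mask, gamma=1.0):
    # Ratio of between-class to within-class scatter traces in the
    # kernel-induced feature space, computed only from kernel values.
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], mask, gamma) for j in range(n)]
         for i in range(n)]
    diag_sum = sum(K[i][i] for i in range(n))
    total_sum = sum(sum(row) for row in K)
    class_term = 0.0
    for c in set(y):
        idx = [i for i in range(n) if y[i] == c]
        class_term += sum(K[i][j] for i in idx for j in idx) / len(idx)
    tr_sb = class_term - total_sum / n   # trace of between-class scatter
    tr_sw = diag_sum - class_term        # trace of within-class scatter
    return tr_sb / tr_sw

# Toy data: feature 0 separates the two classes, feature 1 is noise.
X = [(0.0, 0.3), (0.1, 0.9), (0.2, 0.1),
     (1.0, 0.2), (1.1, 0.8), (0.9, 0.4)]
y = [0, 0, 0, 1, 1, 1]

j0 = kernel_separability(X, y, mask=(1, 0))  # keep only feature 0
j1 = kernel_separability(X, y, mask=(0, 1))  # keep only feature 1
print(j0 > j1)
```

Under these assumptions, selecting a feature subset amounts to searching for the mask that gives the largest criterion value. On the toy data the discriminative feature scores higher than the noisy one.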
Lei Wang, "Feature Selection with Kernel Class Separability", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.30, no. 9, pp. 1534-1546, September 2008, doi:10.1109/TPAMI.2007.70799