Issue No.12 - December (2009 vol.21)
pp: 1798-1802
Yiu-ming Cheung, Hong Kong Baptist University, Kowloon Tong
Hong Zeng, Hong Kong Baptist University, Kowloon Tong
In general, irrelevant features in high-dimensional data degrade the performance of an inference system, e.g., a clustering algorithm or a classifier. In this paper, we therefore present a Local Kernel Regression (LKR) scoring approach that evaluates the relevance of each feature by its ability to preserve the local configuration within a small patch of data. Accordingly, a score index applicable to both supervised and unsupervised learning is developed to identify the relevant features within the framework of local kernel regression. Experimental results show the efficacy of the proposed approach in comparison with existing methods.
Relevant features, feature selection, local kernel regression score, high-dimensional data.
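The abstract does not give the LKR score formula, so the following is only a loose illustration of the general idea of scoring features by local kernel regression, not the authors' method. It is a minimal sketch that assumes a simple unsupervised proxy: each feature is predicted, leave-one-out, by Nadaraya-Watson kernel regression over the sample's nearest neighbours (found using the remaining features), and the normalised squared residual is taken as the score. The function names, the toy data, and the parameters `k` and `bandwidth` are all hypothetical.

```python
import math
import random


def make_toy_data(seed=0, n_per_cluster=30):
    """Toy data: features 0 and 1 carry a two-cluster structure, feature 2 is noise."""
    rng = random.Random(seed)
    rows = []
    for center in (-5.0, 5.0):
        for _ in range(n_per_cluster):
            rows.append([
                center + rng.gauss(0.0, 0.5),
                center + rng.gauss(0.0, 0.5),
                rng.gauss(0.0, 1.0),
            ])
    return rows


def _distance_without(a, b, skip):
    """Euclidean distance between rows a and b, ignoring feature index `skip`."""
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in range(len(a)) if t != skip))


def local_regression_scores(X, k=5, bandwidth=1.0):
    """Per-feature leave-one-out kernel regression residual, divided by the
    feature's variance. Lower means the feature is more locally predictable.
    This is an assumed proxy, not the LKR score from the paper."""
    n, d = len(X), len(X[0])
    scores = []
    for j in range(d):
        column = [row[j] for row in X]
        mean = sum(column) / n
        variance = sum((v - mean) ** 2 for v in column) / n
        if variance == 0.0:
            scores.append(float("inf"))  # a constant feature carries no information
            continue
        squared_error = 0.0
        for i in range(n):
            # k nearest neighbours of sample i, measured on all features except j
            neighbours = sorted(
                (_distance_without(X[i], X[m], j), m) for m in range(n) if m != i
            )[:k]
            # Gaussian kernel weights (Nadaraya-Watson estimator)
            weights = [math.exp(-(dist ** 2) / (2.0 * bandwidth ** 2))
                       for dist, _ in neighbours]
            total = sum(weights)
            if total == 0.0:
                # all weights underflowed: fall back to a plain neighbour average
                weights, total = [1.0] * len(neighbours), float(len(neighbours))
            prediction = sum(w * column[m]
                             for w, (_, m) in zip(weights, neighbours)) / total
            squared_error += (column[i] - prediction) ** 2
        scores.append(squared_error / (n * variance))
    return scores


if __name__ == "__main__":
    for j, s in enumerate(local_regression_scores(make_toy_data())):
        print(f"feature {j}: score {s:.3f}")
```

On the toy data, the two structured features should receive much lower scores than the noise feature, because their values can be recovered from nearby samples while the noise feature's cannot. Ranking features by ascending score would then put the structured ones first.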
Yiu-ming Cheung, Hong Zeng, "Local Kernel Regression Score for Selecting Features of High-Dimensional Data", IEEE Transactions on Knowledge & Data Engineering, vol. 21, no. 12, pp. 1798-1802, December 2009, doi:10.1109/TKDE.2009.23