Issue No. 06 - June (2015 vol. 27)
Yujie He, Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO
Yi Mao, Hefei University of Technology, Hefei, China
Wenlin Chen, Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO
Yixin Chen, Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO
Metric learning, the task of learning a good distance metric, is a key problem in machine learning with many applications. This paper introduces a novel framework for nonlinear metric learning, called kernel density metric learning (KDML), which is easy to use and provides nonlinear, probability-based distance measures. KDML constructs a direct nonlinear mapping from the original input space into a feature space based on kernel density estimation. The nonlinear mapping in KDML embodies established distance measures between probability density functions, and leads to accurate classification on datasets for which existing linear metric learning methods would fail. It addresses the severe challenge that distance-based classifiers face when features come from heterogeneous domains, so that the Euclidean or Mahalanobis distance between original feature vectors is not meaningful. We also propose two ways to determine the kernel bandwidths: an adaptive local scaling approach and an integrated optimization algorithm that learns the Mahalanobis matrix and kernel bandwidths together. KDML is a general framework that can be combined with any existing metric learning algorithm. As concrete examples, we combine KDML with two leading metric learning algorithms, large margin nearest neighbors (LMNN) and neighborhood component analysis (NCA). KDML naturally handles not only numerical features but also categorical ones, a capability rarely found in previous metric learning algorithms. Extensive experimental results on various datasets show that KDML significantly improves the classification accuracy of existing metric learning algorithms.
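The abstract does not spell out the mapping itself, so the following is only an illustrative sketch of one way a kernel-density-based feature mapping could look, not the authors' implementation. It assumes that each original feature is replaced by kernel-estimated class posteriors computed from that feature alone, with a Gaussian kernel and fixed, user-supplied bandwidths. The function name `kde_posterior_map` and its parameters are hypothetical.

```python
import numpy as np


def kde_posterior_map(X_train, y_train, X_query, bandwidths):
    """Map each feature of X_query to kernel-estimated class posteriors.

    Assumption (not taken from the abstract): feature m is replaced by a
    vector of estimated P(class | x_m), using a Gaussian kernel.
    Returns an array of shape (n_query, n_features * n_classes).
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    n_features = X_train.shape[1]

    # Accept one bandwidth for all features or one bandwidth per feature.
    h = np.broadcast_to(np.asarray(bandwidths, dtype=float), (n_features,))

    # One-hot class membership, shape (n_train, n_classes).
    onehot = (y_train[:, None] == classes[None, :]).astype(float)

    blocks = []
    for m in range(n_features):
        # Gaussian kernel weights between query and training values of feature m.
        diff = (X_query[:, m][:, None] - X_train[:, m][None, :]) / h[m]
        weights = np.exp(-0.5 * diff ** 2)          # (n_query, n_train)
        per_class = weights @ onehot                # kernel mass per class
        total = weights.sum(axis=1, keepdims=True)  # kernel mass over all classes
        # Guard against division by zero when a query is far from all data.
        blocks.append(per_class / np.maximum(total, 1e-300))
    return np.hstack(blocks)


# Toy data: two classes that differ in the first feature only.
X = np.array([[0.0, 5.0], [0.1, 5.1], [1.0, 5.0], [1.1, 4.9]])
y = np.array([0, 0, 1, 1])
Z = kde_posterior_map(X, y, np.array([[0.05, 5.0]]), bandwidths=0.2)
print(Z.shape)
```

Under these assumptions, the mapped vectors are probabilities regardless of the units of the original features, which is one way to read the abstract's point about heterogeneous feature domains. A linear metric learner such as LMNN or NCA could then be trained on the mapped vectors in place of the raw inputs.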
Euclidean distance, Kernel, Learning systems, Vectors, Density measurement, Algorithm design and analysis
Y. He, Y. Mao, W. Chen and Y. Chen, "Nonlinear Metric Learning with Kernel Density Estimation," in IEEE Transactions on Knowledge & Data Engineering, vol. 27, no. 6, pp. 1602-1614, 2015.