Metric Learning for Text Documents
April 2006 (vol. 28 no. 4)
pp. 497-508
Many algorithms in machine learning rely on being given a good distance metric over the input space. Rather than using a default metric such as the Euclidean metric, it is desirable to obtain a metric based on the provided data. We consider the problem of learning a Riemannian metric associated with a given differentiable manifold and a set of points. Our approach to the problem involves choosing a metric from a parametric family by maximizing the inverse volume of a given data set of points. From a statistical perspective, it is related to maximum likelihood under a model that assigns probabilities inversely proportional to the Riemannian volume element. We discuss in detail learning a metric on the multinomial simplex, where the metric candidates are pull-back metrics of the Fisher information under a Lie group of transformations. When applied to text document classification, the resulting geodesic distance resembles, but outperforms, the tf-idf cosine similarity measure.
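The baseline (unlearned) geometry on the multinomial simplex is the Fisher information metric, whose geodesic distance has the closed form d(p, q) = 2 arccos(Σᵢ √(pᵢ qᵢ)). A minimal sketch of this baseline distance alongside tf-idf cosine similarity, for intuition only (function names and toy data are illustrative; this is not the learned pull-back metric from the paper):

```python
import numpy as np

def fisher_geodesic_distance(p, q):
    """Geodesic distance under the Fisher information metric on the
    multinomial simplex: d(p, q) = 2 * arccos(sum_i sqrt(p_i * q_i))."""
    bc = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient, in [0, 1]
    return 2.0 * np.arccos(np.clip(bc, -1.0, 1.0))  # clip guards rounding

def cosine_similarity(u, v):
    """Cosine similarity between (e.g., tf-idf weighted) term vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Two toy documents as term-frequency distributions over a 4-word vocabulary.
p = np.array([0.5, 0.3, 0.1, 0.1])
q = np.array([0.4, 0.4, 0.1, 0.1])

print(fisher_geodesic_distance(p, p))  # 0.0 for identical documents
print(fisher_geodesic_distance(p, q))  # small positive distance
print(cosine_similarity(p, q))
```

The map p ↦ 2√p isometrically embeds the simplex in a sphere, which is where the arccos form comes from and why the geodesic distance qualitatively resembles a cosine-type similarity.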

[1] S. Amari and H. Nagaoka, Methods of Information Geometry. Am. Math. Soc., 2000.
[2] A. Gous, “Exponential and Spherical Subfamily Models,” Stanford Univ., 1998.
[3] K. Hall and T. Hofmann, “Learning Curved Multinomial Subfamilies for Natural Language Processing and Information Retrieval,” Proc. 17th Int'l Conf. Machine Learning, P. Langley, ed., pp. 351-358, 2000.
[4] T. Joachims, “The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory and Algorithms,” PhD thesis, Dortmund Univ., 2000.
[5] R.E. Kass and P.W. Vos, Geometrical Foundations of Asymptotic Inference. John Wiley and Sons, Inc., 1997.
[6] G.R.G. Lanckriet, P. Bartlett, N. Cristianini, L. El Ghaoui, and M.I. Jordan, “Learning the Kernel Matrix with Semidefinite Programming,” J. Machine Learning Research, vol. 5, pp. 27-72, 2004.
[7] G. Lebanon, “Riemannian Geometry and Statistical Machine Learning,” Technical Report CMU-LTI-05-189, Carnegie Mellon Univ., 2005.
[8] J.M. Lee, Introduction to Topological Manifolds. Springer, 2000.
[9] J.M. Lee, Introduction to Smooth Manifolds. Springer, 2002.
[10] M.K. Murray and J.W. Rice, Differential Geometry and Statistics. CRC Press, 1993.
[11] S. Roweis and L. Saul, “Nonlinear Dimensionality Reduction by Locally Linear Embedding,” Science, vol. 290, pp. 2323-2326, 2000.
[12] L.K. Saul and M.I. Jordan, “A Variational Principle for Model-Based Interpolation,” Advances in Neural Information Processing Systems 9, M.C. Mozer, M.I. Jordan, and T. Petsche, eds., 1997.
[13] E.P. Xing, A.Y. Ng, M.I. Jordan, and S. Russell, “Distance Metric Learning with Applications to Clustering with Side Information,” Advances in Neural Information Processing Systems 15, S. Becker, S. Thrun, and K. Obermayer, eds., pp. 505-512, 2003.

Index Terms:
Distance learning, text analysis, machine learning.
Guy Lebanon, "Metric Learning for Text Documents," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4, pp. 497-508, April 2006, doi:10.1109/TPAMI.2006.77