Issue No.03 - March (2012 vol.24)
pp: 478-491
Lei Wu, University of Science and Technology of China, China
Steven C.H. Hoi, Nanyang Technological University, Singapore
Rong Jin, Michigan State University, East Lansing
Jianke Zhu, Chinese University of Hong Kong, Hong Kong
Nenghai Yu, University of Science and Technology of China, China
ABSTRACT
Learning distance functions with side information plays a key role in many data mining applications. Conventional distance metric learning approaches often assume that the target distance function takes the form of a Mahalanobis distance. These approaches usually work well on low-dimensional data, but often become computationally expensive or even infeasible on high-dimensional data. In this paper, we propose a novel scheme for learning nonlinear distance functions with side information. It aims to learn a Bregman distance function using a nonparametric approach that is similar to Support Vector Machines. We emphasize that the proposed scheme is more general than the conventional approach to distance metric learning, and that it can handle high-dimensional data efficiently. We verify the efficacy of the proposed distance learning method with extensive experiments on semi-supervised clustering. A comparison with state-of-the-art approaches for learning distance functions with side information reveals clear advantages of the proposed technique.
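For readers unfamiliar with the term, the standard definition of a Bregman distance generated by a differentiable convex function phi is d_phi(x, y) = phi(x) - phi(y) - (x - y)^T grad phi(y). The sketch below only illustrates that textbook definition; it is not the learning algorithm proposed in the paper. The function names (bregman_distance, make_quadratic, neg_entropy) and the example vectors and matrix are hypothetical, chosen for illustration. It shows why this family is more general than a Mahalanobis distance: a quadratic phi recovers the squared Mahalanobis form, while other convex choices such as negative entropy yield nonlinear, asymmetric distances.

```python
import numpy as np


def bregman_distance(phi, grad_phi, x, y):
    """Textbook Bregman distance: phi(x) - phi(y) - <x - y, grad phi(y)>."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return phi(x) - phi(y) - np.dot(x - y, grad_phi(y))


def make_quadratic(A):
    """phi(x) = x^T A x for a symmetric positive definite A (illustrative helper)."""
    A = np.asarray(A, dtype=float)
    return (lambda x: x @ A @ x), (lambda x: 2.0 * (A @ x))


def neg_entropy(x):
    """phi(x) = sum_i x_i log x_i, defined for strictly positive x."""
    return np.sum(x * np.log(x))


def neg_entropy_grad(x):
    return np.log(x) + 1.0


# Example inputs (assumed for illustration): two probability vectors.
x = np.array([0.2, 0.3, 0.5])
y = np.array([0.4, 0.4, 0.2])

# Quadratic phi: the Bregman distance equals the squared Mahalanobis form.
A = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
phi_quad, grad_quad = make_quadratic(A)
d_quad = bregman_distance(phi_quad, grad_quad, x, y)
d_mahalanobis_sq = (x - y) @ A @ (x - y)

# Negative entropy: on probability vectors this equals the KL divergence.
d_entropy_xy = bregman_distance(neg_entropy, neg_entropy_grad, x, y)
d_entropy_yx = bregman_distance(neg_entropy, neg_entropy_grad, y, x)
kl_xy = np.sum(x * np.log(x / y))

print("quadratic phi vs squared Mahalanobis:", d_quad, d_mahalanobis_sq)
print("negative entropy vs KL(x || y):", d_entropy_xy, kl_xy)
print("asymmetry:", d_entropy_xy, d_entropy_yx)
```

Note that a Bregman distance is in general not symmetric and need not satisfy the triangle inequality, so it is a distance function in a looser sense than a metric.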
INDEX TERMS
Bregman distance, distance functions, metric learning, convex functions.
CITATION
Lei Wu, Steven C.H. Hoi, Rong Jin, Jianke Zhu, Nenghai Yu, "Learning Bregman Distance Functions for Semi-Supervised Clustering", IEEE Transactions on Knowledge & Data Engineering, vol.24, no. 3, pp. 478-491, March 2012, doi:10.1109/TKDE.2010.215