Issue No.11 - Nov. (2013 vol.35)
pp: 2624-2637
T. Mensink , ISLA Lab., Univ. of Amsterdam, Amsterdam, Netherlands
J. Verbeek , LEAR Team, INRIA Grenoble, Grenoble, France
F. Perronnin , Xerox Res. Centre Eur. Grenoble, Meylan, France
G. Csurka , Xerox Res. Centre Eur. Grenoble, Meylan, France
ABSTRACT
We study large-scale image classification methods that can incorporate new classes and training images continuously over time at negligible cost. To this end, we consider two distance-based classifiers, the k-nearest neighbor (k-NN) and nearest class mean (NCM) classifiers, and introduce a new metric learning approach for the latter. We also introduce an extension of the NCM classifier to allow for richer class representations. Experiments on the ImageNet 2010 challenge dataset, which contains over 10^6 training images of 1,000 classes, show that, surprisingly, the NCM classifier compares favorably to the more flexible k-NN classifier. Moreover, the NCM performance is comparable to that of linear SVMs which obtain current state-of-the-art performance. Experimentally, we study the generalization performance to classes that were not used to learn the metrics. Using a metric learned on 1,000 classes, we show results for the ImageNet-10K dataset which contains 10,000 classes, and obtain performance that is competitive with the current state-of-the-art while being orders of magnitude faster. Furthermore, we show how a zero-shot class prior based on the ImageNet hierarchy can improve performance when few training images are available.
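As an illustration of why a nearest class mean classifier can absorb new classes at negligible cost, the following is a minimal sketch, not the authors' implementation. It assumes images are already encoded as fixed-length feature vectors, and it uses a hypothetical optional projection matrix `W` as a stand-in for a learned metric; the class name `NearestClassMean` and its methods are illustrative names only.

```python
import numpy as np


class NearestClassMean:
    """Minimal nearest class mean (NCM) classifier with incremental updates."""

    def __init__(self, W=None):
        # W is an optional projection matrix standing in for a learned metric
        # (an assumption for this sketch); with W=None, plain squared
        # Euclidean distance is used.
        self.W = W
        self.sums = {}    # class label -> running sum of feature vectors
        self.counts = {}  # class label -> number of images seen

    def add_images(self, label, X):
        """Add training images (rows of X) for a new or an existing class."""
        X = np.atleast_2d(np.asarray(X, dtype=float))
        self.sums[label] = self.sums.get(label, 0.0) + X.sum(axis=0)
        self.counts[label] = self.counts.get(label, 0) + X.shape[0]

    def predict(self, x):
        """Return the label whose class mean is closest to x."""
        x = np.asarray(x, dtype=float)
        best_label, best_dist = None, np.inf
        for label, total in self.sums.items():
            diff = x - total / self.counts[label]
            if self.W is not None:
                diff = self.W @ diff
            dist = float(diff @ diff)
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


ncm = NearestClassMean()
ncm.add_images("cat", [[0.0, 0.0], [0.0, 2.0]])  # class mean (0, 1)
ncm.add_images("dog", [[4.0, 4.0], [6.0, 4.0]])  # class mean (5, 4)
before = ncm.predict([9.0, 9.0])
ncm.add_images("bird", [[10.0, 10.0]])           # new class: only a mean is stored
after = ncm.predict([9.0, 9.0])
```

In this sketch, adding a class touches only that class's running sum and count, so no existing class needs retraining; the same query moves from "dog" to "bird" once the new class mean is registered.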
INDEX TERMS
Measurement, Training, Support vector machine classification, Covariance matrices, Image classification, Image retrieval, Training data, Metric learning, k-nearest neighbors classification, nearest class mean classification, large scale image classification, transfer learning, zero-shot learning
CITATION
T. Mensink, J. Verbeek, F. Perronnin, G. Csurka, "Distance-Based Image Classification: Generalizing to New Classes at Near-Zero Cost", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 11, pp. 2624-2637, Nov. 2013, doi:10.1109/TPAMI.2013.83