Issue No.02 - Feb. (2013 vol.35)
pp: 476-489
T. Mensink, LEAR Team, INRIA Rhone-Alpes, Montbonnot, France
J. Verbeek, LEAR Team, INRIA Rhone-Alpes, Montbonnot, France
G. Csurka, Xerox Res. Centre Eur. Grenoble, Meylan, France
ABSTRACT
We propose structured prediction models for image labeling that explicitly take into account dependencies among image labels. In our tree-structured models, image labels are nodes, and edges encode dependency relations. To allow for more complex dependencies, we combine labels in a single node and use mixtures of trees. Our models are more expressive than independent predictors, and lead to more accurate label predictions. The gain becomes more significant in an interactive scenario where a user provides the value of some of the image labels at test time. Such an interactive scenario offers an interesting tradeoff between label accuracy and manual labeling effort. The structured models are used to decide which labels should be set by the user, and transfer the user input to more accurate predictions on other image labels. We also apply our models to attribute-based image classification, where attribute predictions of a test image are mapped to class probabilities by means of a given attribute-class mapping. Experimental results on three publicly available benchmark datasets show that in all scenarios our structured models lead to more accurate predictions, and leverage user input much more effectively than state-of-the-art independent models.
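The way a tree-structured model transfers one user-provided label to the remaining labels can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three label names, the unary and pairwise scores, and the use of brute-force enumeration in place of the efficient exact inference (belief propagation) that tree structures permit.

```python
import itertools
import math

# Hypothetical toy model: three binary image labels connected as a tree
# (here a simple chain): sea - beach - indoor. All scores are made up.
LABELS = ["sea", "beach", "indoor"]

# UNARY[i][v]: score for label i taking value v (stands in for classifier output).
UNARY = [[0.0, 0.2], [0.0, -0.1], [0.0, 0.0]]

# PAIRWISE[(i, j)][vi][vj]: score for the edge between labels i and j.
PAIRWISE = {
    (0, 1): [[0.5, -0.5], [-0.5, 0.5]],   # sea and beach tend to agree
    (1, 2): [[-0.5, 0.5], [0.5, -0.5]],   # beach and indoor tend to disagree
}


def score(y):
    """Total score of a joint labeling y (a tuple of 0/1 values)."""
    s = sum(UNARY[i][y[i]] for i in range(len(y)))
    s += sum(table[y[i]][y[j]] for (i, j), table in PAIRWISE.items())
    return s


def marginals(observed=None):
    """Return p(label_i = 1) for every label, given optional user input.

    `observed` maps a label index to the value set by the user.
    Brute-force enumeration is used only because the example is tiny.
    """
    observed = observed or {}
    n = len(LABELS)
    totals = [0.0] * n
    z = 0.0
    for y in itertools.product((0, 1), repeat=n):
        # Skip labelings that contradict the user-provided values.
        if any(y[i] != v for i, v in observed.items()):
            continue
        w = math.exp(score(y))
        z += w
        for i in range(n):
            totals[i] += w * y[i]
    return [t / z for t in totals]


prior = marginals()
posterior = marginals({0: 1})  # the user confirms that "sea" is present
for name, p, q in zip(LABELS, prior, posterior):
    print(f"{name}: {p:.3f} -> {q:.3f}")
```

In this toy setting, fixing "sea" to present raises the marginal of "beach" and lowers that of "indoor", which is the kind of propagation an independent per-label predictor cannot provide.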
INDEX TERMS
Predictive models, Vectors, Labeling, Image edge detection, Pattern recognition, Kernel, Training, statistical pattern recognition, Pattern recognition application computer vision, pattern recognition interactive systems, object recognition, content analysis and indexing
CITATION
T. Mensink, J. Verbeek, G. Csurka, "Tree-Structured CRF Models for Interactive Image Labeling", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 2, pp. 476-489, Feb. 2013, doi:10.1109/TPAMI.2012.100