Issue No. 10 - Oct. 2012 (vol. 34)
pp. 1978-1991
Devi Parikh, Toyota Technological Institute Chicago, Chicago
C. Lawrence Zitnick, Microsoft Research, Redmond
Tsuhan Chen, Cornell University, Ithaca
ABSTRACT
Typically, object recognition is performed based solely on the appearance of the object. However, relevant information also exists in the scene surrounding the object. In this paper, we explore the roles that appearance and contextual information play in object recognition. Through machine experiments and human studies, we show that the importance of contextual information varies with the quality of the appearance information, such as an image's resolution. Our machine experiments explicitly model context between object categories through the use of relative location and relative scale, in addition to co-occurrence. With the use of our context model, our algorithm achieves state-of-the-art performance on the MSRC and Corel data sets. We perform recognition tests for machines and human subjects on low- and high-resolution images, which vary significantly in the amount of appearance information present, using just the object appearance information, the combination of appearance and context, and just context without object appearance information (blind recognition). We also explore the impact of the different sources of context (co-occurrence, relative location, and relative scale). We find that the importance of different types of contextual information varies significantly across data sets such as MSRC and PASCAL.
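The three sources of context named in the abstract (co-occurrence, relative location, and relative scale) can be illustrated with a minimal Python sketch. This is not the authors' model: the function names, the region fields `cy` (normalized centroid row) and `area` (fraction of the image), and the choice of centroid offset and log area ratio as features are all assumptions made here for illustration only.

```python
from collections import Counter
from itertools import combinations
import math


def cooccurrence_counts(images):
    """Count how often each unordered pair of categories appears in the same image.

    `images` is a list of label lists, one per image (hypothetical input format).
    """
    counts = Counter()
    for labels in images:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


def relative_features(region_a, region_b):
    """Return (relative location, relative scale) of region_a with respect to region_b.

    Assumed features: vertical centroid offset and log ratio of areas.
    """
    rel_location = region_a["cy"] - region_b["cy"]
    rel_scale = math.log(region_a["area"] / region_b["area"])
    return rel_location, rel_scale


# Toy label sets, invented for illustration (not taken from any data set).
toy_images = [["sky", "grass", "cow"], ["sky", "building"], ["grass", "cow"]]
pair_counts = cooccurrence_counts(toy_images)

# A "sky" region near the top of the image and a smaller "cow" region lower down.
sky = {"cy": 0.2, "area": 0.4}
cow = {"cy": 0.7, "area": 0.1}
location, scale = relative_features(sky, cow)
```

In this sketch a negative relative location means the first region sits above the second, and a positive relative scale means it is larger. A full context model would turn such statistics into pairwise potentials over category labels, which is beyond what the abstract specifies.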
INDEX TERMS
Context awareness, Image resolution, Image segmentation, Human factors, Image recognition, Context modeling, Computational modeling, human studies, Object recognition, context, tiny images, blind recognition, image labeling
CITATION
Devi Parikh, C. Lawrence Zitnick, Tsuhan Chen, "Exploring Tiny Images: The Roles of Appearance and Contextual Information for Machine and Human Object Recognition", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 10, pp. 1978-1991, Oct. 2012, doi:10.1109/TPAMI.2011.276