Issue No.02 - February (2012 vol.34)
pp: 240-252
Myung Jin Choi, Two Sigma Investments, New York, NY, USA
Antonio Torralba, Comput. Sci. & Artificial Intell. Lab., Massachusetts Inst. of Technol., Cambridge, MA, USA
Alan S. Willsky, Lab. for Inf. & Decision Syst., Massachusetts Inst. of Technol., Cambridge, MA, USA
ABSTRACT
There has been a growing interest in exploiting contextual information in addition to local features to detect and localize multiple object categories in an image. A context model can rule out some unlikely combinations or locations of objects and guide detectors to produce a semantically coherent interpretation of a scene. However, the performance benefit of context models has been limited because most of the previous methods were tested on data sets with only a few object categories, in which most images contain one or two object categories. In this paper, we introduce a new data set with images that contain many instances of different object categories, and propose an efficient model that captures the contextual information among more than a hundred object categories using a tree structure. Our model incorporates global image features, dependencies between object categories, and outputs of local detectors into one probabilistic framework. We demonstrate that our context model improves object recognition performance and provides a coherent interpretation of a scene, which enables a reliable image querying system by multiple object categories. In addition, our model can be applied to scene understanding tasks that local detectors alone cannot solve, such as detecting objects out of context or querying for the most typical and the least typical scenes in a data set.
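To illustrate what a tree over object categories can look like, here is a minimal sketch that fits a dependency tree to binary object-presence labels. It assumes a Chow-Liu-style procedure (a maximum spanning tree over pairwise mutual information), which the abstract does not specify. The toy labels and the function names `mutual_information` and `chow_liu_tree` are hypothetical. The sketch covers only co-occurrence between categories and omits the global image features and local detector outputs that the abstract says the model also incorporates.

```python
import math
from itertools import combinations


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two binary sequences."""
    n = len(x)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for xi, yi in zip(x, y) if xi == a and yi == b) / n
            p_a = sum(1 for xi in x if xi == a) / n
            p_b = sum(1 for yi in y if yi == b) / n
            if p_ab > 0:  # zero-probability cells contribute nothing
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def chow_liu_tree(presence):
    """Return edges (a, b, weight) of a maximum spanning tree over categories.

    presence maps a category name to a list of 0/1 flags, one per image.
    Edge weights are pairwise mutual information; Kruskal's algorithm with
    a union-find structure picks the heaviest edges that form no cycle.
    """
    names = sorted(presence)
    edges = sorted(
        ((mutual_information(presence[a], presence[b]), a, b)
         for a, b in combinations(names, 2)),
        reverse=True,
    )
    parent = {name: name for name in names}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    tree = []
    for weight, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree


# Hypothetical toy annotations: one 0/1 flag per image for each category.
presence = {
    "road":   [1, 1, 1, 1, 0, 0, 0, 0],
    "car":    [1, 1, 1, 0, 0, 0, 0, 0],
    "bed":    [0, 0, 0, 0, 1, 1, 1, 0],
    "pillow": [0, 0, 0, 0, 1, 1, 0, 0],
}
tree = chow_liu_tree(presence)
for a, b, weight in tree:
    print(f"{a} -- {b}: {weight:.3f}")
```

Mutual information is high for categories that tend to exclude each other as well as for those that co-occur, so a tree learned this way can link, for example, an outdoor category to an indoor one. The sign of each dependency would have to be read from the pairwise statistics, not from the edge weight.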
INDEX TERMS
probability, object recognition, image querying system, tree-based context model, contextual information, object categories, image features, probabilistic framework, context modeling, scene analysis, computational modeling, image processing, Markov processes, image databases, Markov random fields, structural models
CITATION
Myung Jin Choi, Antonio Torralba, Alan S. Willsky, "A Tree-Based Context Model for Object Recognition", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.34, no. 2, pp. 240-252, February 2012, doi:10.1109/TPAMI.2011.119