Issue No. 03 - March (2010 vol. 32)
ISSN: 0162-8828
pp: 501-516
Joachim M. Buhmann, ETH Zurich, Zurich
Björn Ommer, University of California at Berkeley, Berkeley
ABSTRACT
Real-world scene understanding requires recognizing object categories in novel visual scenes. This paper describes a composition system that automatically learns structured, hierarchical object representations in an unsupervised manner without requiring manual segmentation or manual object localization. A central concept for learning object models in the challenging, general case of unconstrained scenes, large intraclass variations, large numbers of categories, and lacking supervision information is to exploit the compositional nature of our (visual) world. The compositional nature of visual objects significantly limits their representation complexity and renders learning of structured object models statistically and computationally tractable. We propose a robust descriptor for local image parts and show how characteristic compositions of parts can be learned that are based on an unspecific part vocabulary shared between all categories. Moreover, a Bayesian network is presented that comprises all the compositional constituents together with scene context and object shape. Object recognition is then formulated as a statistical inference problem in this probabilistic model.
INDEX TERMS
Image categorization, object recognition, compositionality, graphical models, visual learning.
CITATION
Joachim M. Buhmann, Björn Ommer, "Learning the Compositional Nature of Visual Object Categories for Recognition", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 32, no. 3, pp. 501-516, March 2010, doi:10.1109/TPAMI.2009.22