IEEE International Conference on Computer Vision (2011)
Barcelona, Spain
Nov. 6, 2011 to Nov. 13, 2011
ISBN: 978-1-4577-1101-5
pp: 161-168
Ryan Farrell, University of Maryland, College Park, USA
Om Oza, University of Maryland, College Park, USA
Ning Zhang, University of California, Berkeley, USA
Vlad I. Morariu, University of Maryland, College Park, USA
Trevor Darrell, University of California, Berkeley, USA
Larry S. Davis, University of Maryland, College Park, USA
ABSTRACT
Subordinate-level categorization typically rests on establishing salient distinctions between part-level characteristics of objects, in contrast to basic-level categorization, where the presence or absence of parts is determinative. We develop an approach for subordinate categorization in vision, focusing on an avian domain due to the fine-grained structure of the category taxonomy for this domain. We explore a pose-normalized appearance model based on a volumetric poselet scheme. The variation in shape and appearance properties of these parts across a taxonomy provides the cues needed for subordinate categorization. Training pose detectors from scratch requires a relatively large amount of training data per category; using a subordinate-level approach, we exploit a pose classifier trained at the basic level and extract part appearance and shape information to build subordinate-level models. Our model associates the underlying image pattern parameters used for detection with corresponding volumetric part location, scale and orientation parameters. These parameters implicitly define a mapping from the image pixels into a pose-normalized appearance space, removing view and pose dependencies and facilitating fine-grained categorization from relatively few training examples.
CITATION

R. Farrell, O. Oza, N. Zhang, V. I. Morariu, T. Darrell and L. S. Davis, "Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance," 2011 IEEE International Conference on Computer Vision (ICCV 2011), Barcelona, 2011, pp. 161-168.
doi:10.1109/ICCV.2011.6126238