Issue No. 07 - July (2008 vol. 30)
Generic Visual Categorization (GVC) is the pattern classification problem of assigning labels to an image based on its semantic content. This is a challenging task, as one has to deal with inherent object/scene variations as well as changes in viewpoint, lighting, and occlusion. Several state-of-the-art GVC systems use a vocabulary of visual terms to characterize images with a histogram of visual word counts. We propose a novel, practical approach to GVC based on a universal vocabulary, which describes the content of all the considered classes of images, and class vocabularies obtained through the adaptation of the universal vocabulary using class-specific data. The main novelty is that an image is characterized by a set of histograms (one per class), where each histogram describes whether the image content is best modeled by the universal vocabulary or the corresponding class vocabulary. This framework is applied to two types of local image features: low-level descriptors such as the popular SIFT, and high-level histograms of word co-occurrences in a spatial neighborhood. It is shown experimentally on two challenging datasets (an in-house database of 19 categories and the PASCAL VOC 2006 dataset) that the proposed approach exhibits state-of-the-art performance at a modest computational cost.
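The per-class histogram idea can be illustrated with a minimal sketch. The abstract does not specify the vocabulary model, so the code below assumes that each visual word is an equal-weight isotropic Gaussian with a shared `sigma`, and that the class vocabularies (already adapted from the universal one) are simply given as lists of means. The names `soft_assign`, `bipartite_histogram`, and `image_signature` are hypothetical and chosen for illustration only; the adaptation step itself is not shown.

```python
import math


def soft_assign(descriptor, means, sigma=1.0):
    """Posterior over visual words for one descriptor.

    Assumption: every word is an equal-weight isotropic Gaussian
    with standard deviation `sigma`.
    """
    log_w = [
        -0.5 * sum((x - m) ** 2 for x, m in zip(descriptor, mean)) / sigma ** 2
        for mean in means
    ]
    top = max(log_w)  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def bipartite_histogram(descriptors, universal_means, class_means, sigma=1.0):
    """Histogram over the merged vocabulary (universal words, then class words).

    The first half of the result holds the mass explained by the universal
    vocabulary, the second half the mass explained by the class vocabulary.
    """
    merged = list(universal_means) + list(class_means)
    hist = [0.0] * len(merged)
    for d in descriptors:
        for k, p in enumerate(soft_assign(d, merged, sigma)):
            hist[k] += p / len(descriptors)
    return hist


def image_signature(descriptors, universal_means, class_vocabularies, sigma=1.0):
    """One histogram per class, keyed by class label."""
    return {
        label: bipartite_histogram(descriptors, universal_means, means, sigma)
        for label, means in class_vocabularies.items()
    }


# Toy 2-D example (made-up numbers, purely illustrative).
universal = [(0.0, 0.0), (4.0, 4.0)]
class_vocabularies = {
    "cat": [(0.5, 0.0), (4.0, 4.5)],   # close to the toy descriptors
    "car": [(-3.0, 0.0), (4.0, 9.0)],  # far from the toy descriptors
}
descriptors = [(0.5, 0.1), (4.1, 4.4), (0.4, -0.1)]
signature = image_signature(descriptors, universal, class_vocabularies)
```

In this toy setup, most of the mass in the "cat" histogram falls on the class half of the merged vocabulary, while for "car" it stays on the universal half. Under these assumptions, that split is the kind of signal a downstream classifier could exploit.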
Object recognition, Scene Analysis, General
Florent Perronnin, "Universal and Adapted Vocabularies for Generic Visual Categorization", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 30, no. 7, pp. 1243-1256, July 2008, doi:10.1109/TPAMI.2007.70755