Issue No.04 - April (2008 vol.30)
pp: 712-727
ABSTRACT
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labelled images of scenes (e.g., coast, forest, city, river) and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently training a multi-way classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly, and training a multi-way classifier on these vectors. To this end we introduce a novel vocabulary using dense colour SIFT descriptors, and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learnt, and the type of discriminative classifier used (k-nearest neighbour or SVM). We achieve superior classification performance to recent publications that have used a bag of visual words representation, in all cases using the authors' own datasets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
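The two-stage pipeline described in the abstract (pLSA topics learnt from bag-of-visual-words counts, followed by a discriminative classifier on the per-image topic vector) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `plsa`, `fold_in` and `knn_predict` are hypothetical, the count matrix is assumed to be already computed (one row per image, one column per visual word), unseen images are handled by the usual "fold-in" step that keeps P(w|z) fixed, and the k-nearest-neighbour step uses a plain Euclidean distance, which the abstract does not specify.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on an (n_images, n_words) count matrix.

    Returns P(z|d) of shape (n_images, n_topics) and
    P(w|z) of shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
        post = p_z_d[:, :, None] * p_w_z[None, :, :]  # (d, z, w)
        post /= post.sum(axis=1, keepdims=True) + EPS
        # M-step: re-estimate both distributions from expected counts
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + EPS
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + EPS
    return p_z_d, p_w_z


def fold_in(counts, p_w_z, n_iter=50):
    """Estimate P(z|d) for unseen images while keeping P(w|z) fixed."""
    n_docs, n_topics = counts.shape[0], p_w_z.shape[0]
    p_z_d = np.full((n_docs, n_topics), 1.0 / n_topics)
    for _ in range(n_iter):
        post = p_z_d[:, :, None] * p_w_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + EPS
        p_z_d = (counts[:, None, :] * post).sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + EPS
    return p_z_d


def knn_predict(train_x, train_y, test_x, k=3):
    """Majority vote among the k nearest training topic vectors.

    Uses squared Euclidean distance (an assumption of this sketch);
    train_y must hold non-negative integer class labels.
    """
    dists = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in train_y[nearest]])
```

In use, one would call `plsa` on the training count matrix, `fold_in` on the test count matrix with the learnt P(w|z), and then `knn_predict` on the resulting topic vectors. The bag-of-words baseline mentioned in the abstract corresponds to passing normalized count vectors to the classifier directly instead of topic vectors.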
INDEX TERMS
Scene Classification, pLSA, Spatial Information
CITATION
Anna Bosch, Andrew Zisserman, Xavier Muñoz, "Scene Classification Using a Hybrid Generative/Discriminative Approach", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.30, no. 4, pp. 712-727, April 2008, doi:10.1109/TPAMI.2007.70716