Issue No.08 - August (2011 vol.33)
pp: 1489-1501
Jianxin Wu, Nanyang Technological University, Singapore
James M. Rehg, Georgia Institute of Technology, Atlanta
ABSTRACT
CENsus TRansform hISTogram (CENTRIST), a new visual descriptor for recognizing topological places or scene categories, is introduced in this paper. We show that place and scene recognition, especially for indoor environments, requires a visual descriptor whose properties differ from those needed in other vision domains (e.g., object recognition). CENTRIST satisfies these properties and suits the place and scene recognition task. It is a holistic representation and has strong generalizability for category recognition. CENTRIST mainly encodes the structural properties within an image and suppresses detailed textural information. Our experiments demonstrate that CENTRIST outperforms the current state of the art on several place and scene recognition data sets when compared with other descriptors such as SIFT and Gist. In addition, it is easy to implement and extremely fast to evaluate.
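
As a rough illustration of the idea named in the acronym, the sketch below computes a census transform over a gray-level image and histograms the resulting 8-bit codes. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the neighbor ordering (left to right, top to bottom), the choice to set a bit when the center is at least as bright as the neighbor, and the skipping of border pixels are all assumptions made for this example.

```python
# Hypothetical helper names; bit order and comparison direction are assumptions.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                    (0, -1),           (0, 1),
                    (1, -1),  (1, 0),  (1, 1)]


def census_transform(img):
    """Return an 8-bit census code for every interior pixel.

    img is a list of equal-length rows of gray levels. Border pixels
    are skipped because they lack a full 3x3 neighborhood.
    """
    height, width = len(img), len(img[0])
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            center = img[y][x]
            code = 0
            for dy, dx in NEIGHBOR_OFFSETS:
                # Assumed convention: bit is 1 when center >= neighbor.
                bit = 1 if center >= img[y + dy][x + dx] else 0
                code = (code << 1) | bit
            row.append(code)
        codes.append(row)
    return codes


def census_histogram(img):
    """Count how often each of the 256 census codes occurs in img."""
    hist = [0] * 256
    for row in census_transform(img):
        for code in row:
            hist[code] += 1
    return hist
```

Because each code depends only on the ordering of a pixel relative to its neighbors, adding a constant to every gray level leaves the histogram unchanged, which gives a sense of why such a descriptor captures local structure rather than absolute intensity.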
INDEX TERMS
Place recognition, scene recognition, visual descriptor, Census Transform, SIFT, Gist.
CITATION
Jianxin Wu, James M. Rehg, "CENTRIST: A Visual Descriptor for Scene Categorization", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 33, no. 8, pp. 1489-1501, August 2011, doi:10.1109/TPAMI.2010.224