Issue No.03 - March (2014 vol.36)
pp: 493-506
Yongzhen Huang , Nat. Lab. of Pattern Recognition (NLPR), Inst. of Autom., Beijing, China
Zifeng Wu , Nat. Lab. of Pattern Recognition (NLPR), Inst. of Autom., Beijing, China
Liang Wang , Nat. Lab. of Pattern Recognition (NLPR), Inst. of Autom., Beijing, China
Tieniu Tan , Nat. Lab. of Pattern Recognition (NLPR), Inst. of Autom., Beijing, China
ABSTRACT
Image classification is an active topic in computer vision and pattern recognition. Feature coding, a key component of image classification, has been widely studied over the past several years, and a number of coding algorithms have been proposed. However, there has been no comprehensive study of the connections between different coding methods, especially of how they have evolved. In this paper, we first survey various feature coding methods, including their motivations and mathematical representations, and then explore their relations, on the basis of which we propose a taxonomy that reveals their evolution. Further, we summarize the main characteristics of current algorithms, each of which is shared by several coding strategies. Finally, we choose several representatives from different kinds of coding approaches and empirically evaluate them with respect to the size of the codebook and the number of training samples on several widely used databases (15-Scenes, Caltech-256, PASCAL VOC07, and SUN397). The experimental findings support our theoretical analysis, which we expect to benefit both practical applications and future research.
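As a rough illustration of what "feature coding" means in a bag-of-features pipeline, the sketch below encodes a few local descriptors against a small codebook using two standard schemes, hard voting and soft assignment, and then average-pools the codes into an image-level vector. It is not taken from the paper: the toy codebook, the toy descriptors, the smoothing parameter `beta`, and all function names are assumptions made for illustration only.

```python
import math

def squared_distance(x, b):
    """Squared Euclidean distance between a descriptor and a codeword."""
    return sum((xi - bi) ** 2 for xi, bi in zip(x, b))

def hard_code(x, codebook):
    """Hard voting: 1 for the nearest codeword, 0 for all others."""
    distances = [squared_distance(x, b) for b in codebook]
    nearest = distances.index(min(distances))
    return [1.0 if k == nearest else 0.0 for k in range(len(codebook))]

def soft_code(x, codebook, beta=1.0):
    """Soft assignment: normalized exp(-beta * squared distance).

    beta is an illustrative smoothing parameter (an assumption here).
    Subtracting the minimum distance keeps the exponentials stable.
    """
    distances = [squared_distance(x, b) for b in codebook]
    d_min = min(distances)
    weights = [math.exp(-beta * (d - d_min)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

def average_pool(codes):
    """Average the per-descriptor codes into one image-level vector."""
    n = len(codes)
    return [sum(column) / n for column in zip(*codes)]

# Toy data (hypothetical): 3 codewords and 4 local descriptors in 2-D.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 0.1], [0.8, 0.0], [0.1, 0.9]]

hard_hist = average_pool([hard_code(x, codebook) for x in descriptors])
soft_hist = average_pool([soft_code(x, codebook, beta=5.0) for x in descriptors])

print("hard voting    :", hard_hist)
print("soft assignment:", soft_hist)
```

In this toy setting, the codebook size is the length of `codebook`, which is the quantity the evaluation described above varies. A real pipeline would learn the codebook from training descriptors rather than fix it by hand.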
INDEX TERMS
Encoding, Image coding, Feature extraction, Vectors, Image classification, Image reconstruction, Manifolds, bag-of-features, feature coding
CITATION
Yongzhen Huang, Zifeng Wu, Liang Wang, Tieniu Tan, "Feature Coding in Image Classification: A Comprehensive Study", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.36, no. 3, pp. 493-506, March 2014, doi:10.1109/TPAMI.2013.113