Issue No.11 - Nov. (2012 vol.34)
pp: 2177-2188
Gang Wang , Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore, Singapore
D. Hoiem , Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
D. Forsyth , Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
ABSTRACT
Measuring image similarity is a central topic in computer vision. In this paper, we propose to measure image similarity by learning from online Flickr image groups. We do so in four steps: choosing 103 Flickr groups, building a one-versus-all multiclass classifier that assigns test images to a group, taking the set of classifier responses as features, and calculating the distance between feature vectors to measure image similarity. Experimental results on the Corel dataset and the PASCAL VOC 2007 dataset show that our approach performs better on image matching, retrieval, and classification than conventional visual features do. To build our similarity measure, we need one-versus-all classifiers that are accurate and can be trained quickly on very large quantities of data. We adopt an SVM classifier with a histogram intersection kernel. We describe a novel fast training algorithm for this classifier: the Stochastic Intersection Kernel MAchine (SIKMA) training algorithm. This method can produce a kernel classifier that is more accurate than a linear classifier on tens of thousands of examples in minutes.
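The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce the SIKMA training algorithm; it only shows how a histogram intersection kernel is evaluated and how the stacked one-versus-all responses could serve as a feature vector for measuring distance. The function names, the shared support set, and the array shapes are assumptions made for illustration, and Euclidean distance is used here as one plausible choice of distance.

```python
import numpy as np


def histogram_intersection(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    return float(np.minimum(x, y).sum())


def response_features(image_hist, support_hists, alphas, biases):
    """Stack one-versus-all decision values into one feature vector.

    Assumed (hypothetical) layout:
      support_hists: (n_support, d) histograms shared by all classifiers
      alphas:        (n_groups, n_support) signed weights, one row per group
      biases:        (n_groups,) offsets, one per group
    """
    # Kernel value between the image and every support histogram.
    k = np.minimum(support_hists, image_hist).sum(axis=1)
    # One decision value per group classifier.
    return alphas @ k + biases


def image_distance(features_a, features_b):
    """Euclidean distance between two response vectors."""
    return float(np.linalg.norm(features_a - features_b))
```

With trained weights for each of the 103 groups, two images would be compared by calling response_features on each histogram and passing the two resulting vectors to image_distance; a smaller value means the images are judged more similar.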
INDEX TERMS
Kernel, Training, Support vector machines, Histograms, Visualization, Feature extraction, Euclidean distance, image organization, Image similarity, kernel machines, stochastic gradient descent, online learning, image classification
CITATION
Gang Wang, D. Hoiem, D. Forsyth, "Learning Image Similarity from Flickr Groups Using Fast Kernel Machines", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.34, no. 11, pp. 2177-2188, Nov. 2012, doi:10.1109/TPAMI.2012.29