Issue No.07 - July (2012 vol.34)
pp: 1342-1353
Xiaoou Tang , The Chinese University of Hong Kong, Hong Kong
Ke Liu , The Chinese University of Hong Kong, Hong Kong
Jingyu Cui , Stanford University, Stanford
Fang Wen , Microsoft Research Asia
Xiaogang Wang , The Chinese University of Hong Kong, Hong Kong
ABSTRACT
Web-scale image search engines (e.g., Google image search, Bing image search) rely mostly on surrounding text features. Query keywords alone rarely reveal a user's search intention, so the results are often ambiguous and noisy, and far from satisfactory. Visual information is needed to resolve this ambiguity in text-based image retrieval. In this paper, we propose a novel Internet image search approach. It requires the user only to click on one query image, and the images in a pool retrieved by text-based search are then reranked based on both visual and textual content. Our key contribution is to capture the user's search intention from this one-click query image in four steps. 1) The query image is categorized into one of several predefined adaptive weight categories, which reflect the user's search intention at a coarse level. Inside each category, a specific weight schema combines the visual features in a way adapted to this kind of image, so that the text-based search result is reranked more accurately. 2) Based on the visual content of the query image selected by the user, and through image clustering, the query keywords are expanded to capture user intention. 3) The expanded keywords are used to enlarge the image pool so that it contains more relevant images. 4) The expanded keywords are also used to expand the query image into multiple positive visual examples, from which new query-specific visual and textual similarity metrics are learned to further improve content-based image reranking. All of these steps are automatic and require no extra effort from the user. This is critically important for any commercial web-based image search engine, where the user interface has to be extremely simple. Besides this key contribution, we design a set of visual features that are both effective and efficient for Internet image search. Experimental evaluation shows that our approach significantly improves the precision of top-ranked images and also the user experience.
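As a rough illustration of steps 1) and 2), the sketch below reranks a small text-retrieved pool with a category-specific weighted sum of per-feature distances, then picks an expansion keyword from the words attached to the top-ranked images. Everything in it is an assumption made for illustration and is not taken from the paper: the category names, the feature names (color, texture, shape), the weight values, the L1 distance, and the toy data are hypothetical, and the simple word count over the top results stands in for the image clustering that the abstract describes.

```python
from collections import Counter

# Hypothetical weight schemas: one set of feature weights per category.
CATEGORY_WEIGHTS = {
    "object": {"color": 0.3, "texture": 0.2, "shape": 0.5},
    "scenery": {"color": 0.6, "texture": 0.4, "shape": 0.0},
}


def weighted_distance(query_feats, cand_feats, weights):
    """Weighted sum of per-feature L1 distances (illustrative metric only)."""
    total = 0.0
    for name, weight in weights.items():
        q, c = query_feats[name], cand_feats[name]
        total += weight * sum(abs(a - b) for a, b in zip(q, c))
    return total


def rerank(query, pool, category):
    """Sort the pool by distance to the query under the category's weights."""
    weights = CATEGORY_WEIGHTS[category]
    return sorted(
        pool,
        key=lambda img: weighted_distance(query["feats"], img["feats"], weights),
    )


def expand_keywords(ranked, query_keyword, top_k=2, n_words=1):
    """Pick the words that occur most often among the top_k reranked images.

    A simplified stand-in for clustering-based keyword expansion.
    """
    counts = Counter()
    for img in ranked[:top_k]:
        counts.update(w for w in set(img["words"]) if w != query_keyword)
    return [word for word, _ in counts.most_common(n_words)]


# Toy data: the user typed "apple" and clicked an image of the fruit.
query = {"feats": {"color": [1.0, 0.0], "texture": [0.2], "shape": [0.9]}}
pool = [
    {"id": "phone1", "words": ["apple", "iphone", "phone"],
     "feats": {"color": [0.1, 0.1], "texture": [0.9], "shape": [0.2]}},
    {"id": "fruit1", "words": ["apple", "fruit", "red"],
     "feats": {"color": [0.9, 0.1], "texture": [0.3], "shape": [0.8]}},
    {"id": "logo1", "words": ["apple", "logo", "company"],
     "feats": {"color": [0.2, 0.2], "texture": [0.8], "shape": [0.3]}},
    {"id": "fruit2", "words": ["apple", "fruit", "tree"],
     "feats": {"color": [0.8, 0.0], "texture": [0.2], "shape": [0.9]}},
]

ranked = rerank(query, pool, category="object")
ranked_ids = [img["id"] for img in ranked]
expanded = expand_keywords(ranked, query_keyword="apple")
print(ranked_ids, expanded)
```

In this toy setting the two fruit images move to the top of the list and "fruit" is chosen as the expansion word, which could then be appended to the original query to enlarge the pool as in step 3).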
INDEX TERMS
Image search, intention, image reranking, adaptive similarity, keyword expansion.
CITATION
Xiaoou Tang, Ke Liu, Jingyu Cui, Fang Wen, Xiaogang Wang, "IntentSearch: Capturing User Intention for One-Click Internet Image Search", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.34, no. 7, pp. 1342-1353, July 2012, doi:10.1109/TPAMI.2011.242