Issue No.03 - March (2013 vol.35)
pp: 716-727
Lei Wu , Dept. of Comput. Sci., Univ. of Pittsburgh, Pittsburgh, PA, USA
Rong Jin , Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
A. K. Jain , Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
ABSTRACT
Many social image search engines are based on keyword/tag matching because tag-based image retrieval (TBIR) is both efficient and effective. The performance of TBIR, however, depends heavily on the availability and quality of manual tags, and recent studies have shown that manual tags are often unreliable and inconsistent. In addition, because many users choose general and ambiguous tags to minimize the effort of finding appropriate words, tags that are specific to the visual content of images tend to be missing or noisy, which limits the performance of TBIR. To address this challenge, we study the problem of tag completion, where the goal is to automatically fill in the missing tags and correct the noisy tags for given images. We represent the image-tag relation by a tag matrix and search for the optimal tag matrix that is consistent with both the observed tags and the visual similarity. We propose a new algorithm for solving this optimization problem. Extensive empirical studies show that the proposed algorithm is significantly more effective than the state-of-the-art algorithms. Our studies also verify that the proposed algorithm is computationally efficient and scales well to large databases.
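The idea of a tag matrix that agrees with both the observed tags and the visual similarity can be illustrated with a small sketch. The code below is not the algorithm proposed in the paper; it swaps in a simple graph-regularized least-squares objective, f(T) = ||T - T_obs||^2 + (lam/2) * sum_ij S_ij ||T_i - T_j||^2, minimized by projected gradient descent. The function names, the cosine similarity on nonnegative features, and the parameters `lam`, `step`, and `iters` are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def complete_tags(observed, features, lam=1.0, step=0.05, iters=500):
    """Illustrative tag completion (hypothetical stand-in, not the paper's method).

    observed: n x m list of 0/1 tag indicators (rows = images, columns = tags).
    features: n visual feature vectors, assumed nonnegative (e.g. histograms).
    Returns an n x m matrix of tag scores in [0, 1].
    """
    n, m = len(observed), len(observed[0])
    # Pairwise visual similarity, clamped at zero, with no self-similarity.
    sim = [[max(0.0, cosine_similarity(features[i], features[j])) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    tags = [[float(x) for x in row] for row in observed]
    for _ in range(iters):
        updated = []
        for i in range(n):
            row = []
            for k in range(m):
                # Gradient of the data-fit term: stay close to the observed tags.
                grad = 2.0 * (tags[i][k] - observed[i][k])
                # Gradient of the smoothness term: similar images share tags.
                grad += 2.0 * lam * sum(sim[i][j] * (tags[i][k] - tags[j][k])
                                        for j in range(n))
                # Gradient step followed by projection onto [0, 1].
                row.append(min(1.0, max(0.0, tags[i][k] - step * grad)))
            updated.append(row)
        tags = updated
    return tags


# Toy data (made up): images 0 and 1 look alike, image 2 looks different.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
# Image 1 is missing the second tag that its near-duplicate, image 0, carries.
observed = [[1, 1], [1, 0], [0, 0]]
completed = complete_tags(observed, features)
for row in completed:
    print([round(score, 2) for score in row])
```

In this toy setting the score of the missing tag for image 1 is pulled up toward that of its visual neighbor, while the dissimilar image 2 stays close to zero. The fixed step size is chosen for this small example and would need tuning, or a line search, on real data.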
INDEX TERMS
Visualization, Optimization, Image retrieval, Noise measurement, Vectors, Feature extraction, Correlation, Metric learning, Tag completion, Matrix completion, Tag-based image retrieval, Image annotation
CITATION
Lei Wu, Rong Jin, A. K. Jain, "Tag Completion for Image Retrieval," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 3, pp. 716-727, March 2013, doi:10.1109/TPAMI.2012.124