Issue No.07 - July (2011 vol.33)
pp: 1281-1294
Ning Zhou, University of North Carolina, Charlotte
William K. Cheung, Hong Kong Baptist University, Hong Kong
Guoping Qiu, University of Nottingham, Nottingham
Xiangyang Xue, Fudan University, Shanghai
ABSTRACT
The increasing availability of large quantities of user-contributed images with labels has created opportunities for developing automatic tagging tools that facilitate image search and retrieval. In this paper, we present a novel hybrid probabilistic model (HPM) which integrates low-level image features and high-level user-provided tags to automatically tag images. For images without any tags, HPM predicts new tags based solely on the low-level image features. For images with user-provided tags, HPM jointly exploits both the image features and the tags in a unified probabilistic framework to recommend additional tags for the images. The HPM framework makes use of the tag-image association matrix (TIAM). However, since the number of images is usually very large and user-provided tags are diverse, TIAM is very sparse, which makes it difficult to reliably estimate tag-to-tag co-occurrence probabilities. We develop a collaborative filtering method based on nonnegative matrix factorization (NMF) to tackle this data sparsity issue. In addition, an L1-norm kernel method is used to estimate the correlations between image features and semantic concepts. The effectiveness of the proposed approach has been evaluated using three databases containing 5,000 images with 371 tags, 31,695 images with 5,587 tags, and 269,648 images with 5,018 tags, respectively.
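The two building blocks named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HPM: the abstract does not give the model's equations, so the code assumes a generic NMF with standard multiplicative updates and a generic product Laplacian (L1-distance) kernel density estimate. The function names (nmf, tag_cooccurrence, l1_kernel_density), the rank, the bandwidth, and the toy matrix are all hypothetical choices made for illustration only.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (tags x images) as W @ H using
    multiplicative updates for the squared-error objective.
    Generic NMF, not necessarily the variant used in the paper."""
    rng = np.random.default_rng(seed)
    n_tags, n_images = V.shape
    W = rng.random((n_tags, rank)) + eps
    H = rng.random((rank, n_images)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def tag_cooccurrence(V_hat):
    """Row-normalised tag-to-tag co-occurrence scores computed from a
    (smoothed) tag-image association matrix."""
    C = V_hat @ V_hat.T
    return C / C.sum(axis=1, keepdims=True)


def l1_kernel_density(x, samples, bandwidth=1.0):
    """Kernel density estimate at point x using a product Laplacian kernel,
    i.e. a kernel that depends on the L1 distance to each sample."""
    d = samples.shape[1]
    dist = np.abs(samples - x).sum(axis=1)   # L1 distance to every sample
    norm = (2.0 * bandwidth) ** d            # normalising constant of the kernel
    return np.exp(-dist / bandwidth).mean() / norm


# Hypothetical toy TIAM: 4 tags (rows) x 5 images (columns), 1 = tag present.
tiam = np.array([
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
], dtype=float)

W, H = nmf(tiam, rank=2)
smoothed = W @ H                   # dense low-rank estimate of the sparse TIAM
cooc = tag_cooccurrence(smoothed)  # each row sums to 1

# Density of hypothetical 1-D feature values, evaluated at 0.
density_at_zero = l1_kernel_density(np.zeros(1), np.zeros((3, 1)))
```

The low-rank product W @ H assigns nonzero scores to tag-image pairs that were unobserved in the sparse matrix, which is the sense in which factorization can ease co-occurrence estimation; how HPM actually combines these quantities is specified in the paper itself, not in this sketch.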
INDEX TERMS
Automatic image tagging, collaborative filtering, feature integration, nonnegative matrix factorization, kernel density estimation.
CITATION
Ning Zhou, William K. Cheung, Guoping Qiu, Xiangyang Xue, "A Hybrid Probabilistic Model for Unified Collaborative and Content-Based Image Tagging", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 33, no. 7, pp. 1281-1294, July 2011, doi:10.1109/TPAMI.2010.204