A Multimedia Retrieval Framework Based on Semi-Supervised Ranking and Relevance Feedback
April 2012 (vol. 34 no. 4)
pp. 723-742
Dong Xu, Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
Feiping Nie, Dept. of Comput. Sci. & Eng., Univ. of Texas at Arlington, Arlington, TX, USA
Yi Yang, Coll. of Comput. Sci., Zhejiang Univ., Hangzhou, China
Jiebo Luo, Kodak Res. Labs., Eastman Kodak Co., Rochester, NY, USA
Yueting Zhuang, Coll. of Comput. Sci., Zhejiang Univ., Hangzhou, China
Yunhe Pan, Coll. of Comput. Sci., Zhejiang Univ., Hangzhou, China
We present a new framework for multimedia content analysis and retrieval that consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking scores of its neighboring points. A unified objective function is then proposed to globally align the local models from all the data points so that an optimal ranking score can be assigned to each data point. Second, we propose a semi-supervised long-term Relevance Feedback (RF) algorithm to refine the multimedia data representation. The proposed long-term RF algorithm utilizes both the multimedia data distribution in the multimedia feature space and the historical RF information provided by users. A trace ratio optimization problem is then formulated and solved by an efficient algorithm. The algorithms have been applied to several content-based multimedia retrieval applications, including cross-media retrieval, image retrieval, and 3D motion/pose data retrieval. Comprehensive experiments on four data sets demonstrate the framework's advantages in precision, robustness, scalability, and computational efficiency.
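The abstract describes the ranking step only in words, so the following is a minimal sketch of one way the idea could be realized, not the authors' implementation. It assumes a ridge-regularized linear model fitted on each point's k nearest neighbors, sums the resulting local residual matrices into one global Laplacian-like matrix, and then solves a regularized linear system to spread a query's score over the data. The function names (`local_regression_laplacian`, `rank_from_query`) and the parameters `k`, `lam`, and `query_weight` are hypothetical choices made for illustration.

```python
import numpy as np


def local_regression_laplacian(X, k=5, lam=1.0):
    """Sum local ridge-regression residual matrices into one (n, n) matrix.

    X is an (n, d) array of feature vectors. For each point, a linear model
    with bias is fitted (in closed form) on its k nearest neighbors; the
    quadratic form f^T L_i f is the residual of predicting the scores f of
    that neighborhood. All choices here are illustrative assumptions.
    """
    n, d = X.shape
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    H = np.eye(k) - np.ones((k, k)) / k                 # centering matrix
    L = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(dist[i])[:k]                   # neighborhood (includes i)
        Xc = H @ X[idx]                                 # centered neighbors, (k, d)
        # Local residual matrix: H - Xc (Xc^T Xc + lam I)^{-1} Xc^T
        Li = H - Xc @ np.linalg.solve(Xc.T @ Xc + lam * np.eye(d), Xc.T)
        L[np.ix_(idx, idx)] += Li                       # global alignment by summation
    return L


def rank_from_query(L, query_idx, query_weight=1e4):
    """Minimize f^T L f + sum_i u_i (f_i - y_i)^2 with y = 1 at the query.

    The weights u_i (large for the query, 1 elsewhere) are an assumption.
    """
    n = L.shape[0]
    y = np.zeros(n)
    y[query_idx] = 1.0
    u = np.ones(n)
    u[query_idx] = query_weight
    return np.linalg.solve(L + np.diag(u), u * y)
```

Calling `rank_from_query(local_regression_laplacian(X), query_idx=0)` returns one score per row of `X`; sorting the scores in descending order gives a ranking relative to the query. The dense solve is cubic in the number of points, so this sketch says nothing about the scalability reported in the abstract.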

Index Terms:
relevance feedback, content-based retrieval, data structures, image retrieval, Laplace equations, learning (artificial intelligence), matrix algebra, multimedia computing, regression analysis, pose data retrieval, semisupervised ranking, multimedia content analysis, multimedia content retrieval, semisupervised algorithm, local regression, global alignment, learning, Laplacian matrix, data ranking, local linear regression model, ranking score prediction, unified objective function, semisupervised long-term relevance feedback algorithm, multimedia data representation, multimedia data distribution, multimedia feature space, trace ratio optimization problem, content-based multimedia retrieval applications, cross-media retrieval, 3D motion data retrieval, multimedia communication, radio frequency, algorithm design and analysis, multimedia databases, data models, manifolds, content-based multimedia retrieval, semi-supervised learning, ranking algorithm
Citation:
Dong Xu, Feiping Nie, Yi Yang, Jiebo Luo, Yueting Zhuang, Yunhe Pan, "A Multimedia Retrieval Framework Based on Semi-Supervised Ranking and Relevance Feedback," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 723-742, April 2012, doi:10.1109/TPAMI.2011.170