Issue No.03 - July-Sept. (2013 vol.20)
pp: 47-57
Guan-Long Wu, National Taiwan University
Yin-Hsi Kuo, National Taiwan University
Tzu-Hsuan Chiu, National Taiwan University
Winston H. Hsu, National Taiwan University
Lexing Xie, Australian National University
ABSTRACT
Retrieving relevant videos from a large corpus on mobile devices is a vital challenge. This article addresses two key issues for mobile search on user-generated videos. The first is the lack of a good relevance measurement for learning semantically rich representations, due to the unconstrained nature of online videos. The second is the limited resources of mobile devices, together with the stringent bandwidth and delay requirements between the device and the video server. The authors propose a knowledge-embedded sparse projection learning approach. To alleviate the need for expensive annotation in hash learning, they investigate various approaches for pseudo label mining, where explicit semantic analysis leverages Wikipedia. In addition, they propose a novel sparse projection method to address the efficiency challenge by learning a discriminative compact representation that drastically reduces transmission costs. With fewer than 10 percent nonzero elements in the projection matrix, it also reduces computational and storage costs. The experimental results on 100,000 videos show that the proposed algorithm yields performance competitive with the prior state-of-the-art hashing methods, which are not suited to mobile devices and rely solely on costly manual annotations. The average query time for 100,000 videos was only 0.592 seconds.
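To make the idea of a sparse projection producing a compact code more concrete, the following is a minimal sketch and not the authors' method. It assumes a random sparse sign projection in place of the learned, knowledge-embedded projection described in the abstract, and the function names, the 512-dimensional feature, and the 32-bit code length are all hypothetical choices made for illustration. It shows only why sparsity helps: each bit is computed from roughly 10 percent of the feature entries, and the device would send a short bit code rather than the full feature vector.

```python
import random


def make_sparse_projection(n_bits, dim, density=0.1, seed=0):
    """Build a random sparse projection (an assumption; the paper learns it).

    Each row is stored as {column index: weight} with only nonzero entries,
    so storage and per-bit cost scale with the number of nonzeros.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_bits):
        row = {}
        for j in range(dim):
            if rng.random() < density:
                row[j] = rng.choice((-1.0, 1.0))
        rows.append(row)
    return rows


def hash_code(projection, feature):
    """Return an integer whose bits are the signs of the sparse projections."""
    code = 0
    for row in projection:
        dot = sum(w * feature[j] for j, w in row.items())
        code = (code << 1) | (1 if dot > 0 else 0)
    return code


def hamming(a, b):
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    dim, n_bits = 512, 32  # hypothetical sizes
    projection = make_sparse_projection(n_bits, dim)
    rng = random.Random(1)
    query = [rng.gauss(0, 1) for _ in range(dim)]
    near = [x + 0.05 * rng.gauss(0, 1) for x in query]  # slightly perturbed copy
    far = [rng.gauss(0, 1) for _ in range(dim)]         # unrelated vector
    q, n, f = (hash_code(projection, v) for v in (query, near, far))
    print("Hamming distance to near item:", hamming(q, n))
    print("Hamming distance to far item:", hamming(q, f))
```

In this sketch a server would rank stored codes by Hamming distance to the query code. A learned projection, as proposed in the article, would replace `make_sparse_projection` so that the bits reflect semantic rather than purely geometric similarity.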
INDEX TERMS
Semantics, Mobile communication, Sparse matrices, Mobile handsets, Encyclopedias, Electronic publishing, mobile video retrieval, explicit semantic analysis, multimedia, multimedia applications, content-based video search, hashing, sparsity
CITATION
Guan-Long Wu, Yin-Hsi Kuo, Tzu-Hsuan Chiu, Winston H. Hsu, Lexing Xie, "Scalable Mobile Video Retrieval with Sparse Projection Learning and Pseudo Label Mining," IEEE MultiMedia, vol. 20, no. 3, pp. 47-57, July-Sept. 2013, doi:10.1109/MMUL.2013.13