Issue No. 3 - March 2011 (vol. 33)
pp. 618-630
Alexandre Karpenko , University of Toronto, Toronto
Parham Aarabi , University of Toronto, Toronto
ABSTRACT
In this paper, we present a large database of over 50,000 user-labeled videos collected from YouTube. We develop a compact representation called “tiny videos” that achieves high video compression rates while retaining the overall visual appearance of the video as it varies over time. We show that frame sampling using affinity propagation—an exemplar-based clustering algorithm—achieves the best trade-off between compression and video recall. We use this large collection of user-labeled videos in conjunction with simple data mining techniques to perform related video retrieval, as well as classification of images and video frames. The classification results achieved by tiny videos are compared with the tiny images framework [24] for a variety of recognition tasks. The tiny images data set consists of 80 million images collected from the Internet. These are the largest labeled research data sets of videos and images available to date. We show that tiny videos are better suited for classifying scenery and sports activities, while tiny images perform better at recognizing objects. Furthermore, we demonstrate that combining the tiny images and tiny videos data sets improves classification precision in a wider range of categories.
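The frame-sampling step summarized above relies on affinity propagation, the exemplar-based clustering algorithm of Frey and Dueck [7]. As a rough illustration only (not the authors' implementation), the following Python sketch selects exemplar keyframes from a stack of tiny, downsampled frames; the 32x32 frame size, the use of scikit-learn's AffinityPropagation class, and its default negative squared Euclidean similarity are assumptions.

import numpy as np
from sklearn.cluster import AffinityPropagation

def sample_keyframes(frames, damping=0.9):
    # frames: array of shape (n_frames, height, width, channels),
    # e.g. tiny 32x32 RGB frames downsampled from the source video.
    # Flatten each frame into a feature vector for clustering.
    X = frames.reshape(len(frames), -1).astype(np.float64)
    # Affinity propagation clusters the frames by passing messages
    # between them and keeps one representative (exemplar) frame per
    # cluster; damping stabilizes the message-passing updates.
    ap = AffinityPropagation(damping=damping, random_state=0).fit(X)
    return ap.cluster_centers_indices_

# Usage (hypothetical): keyframe_idx = sample_keyframes(tiny_frames)
# The exemplar frames then stand in for the full video, trading storage
# for recall as discussed in the abstract.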
INDEX TERMS
Image classification, content-based retrieval, tiny videos, tiny images, data mining, nearest-neighbor methods.
CITATION
Alexandre Karpenko, Parham Aarabi, "Tiny Videos: A Large Data Set for Nonparametric Video Retrieval and Frame Classification," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 33, no. 3, pp. 618-630, March 2011, doi:10.1109/TPAMI.2010.118
REFERENCES
[1] YouTube's APIs and Developer Tools, http://code.google.com/apis/youtube/overview.html, 2010.
[2] The Internet Archive, http://www.archive.org, 2009.
[3] N. Dimitrova, T. McGee, and H. Elenbaas, "Video Keyframe Extraction and Filtering: A Keyframe Is Not a Keyframe to Everyone," Proc. Sixth Int'l Conf. Information and Knowledge Management, pp. 113-120, 1997.
[4] M. Douze, A. Gaidon, H. Jégou, M. Marszałek, and C. Schmid, "INRIA-LEAR's Video Copy Detection System," Proc. Text Retrieval Conf. Video Retrieval Evaluation Workshop, http://lear.inrialpes.fr/pubs/2008/DGJMS08a, Nov. 2008.
[5] S. Eickeler and S. Muller, "Content-Based Video Indexing of TV Broadcast News Using Hidden Markov Models," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, vol. 6, pp. 2997-3000, Mar. 1999.
[6] WordNet: An Electronic Lexical Database (Language, Speech, and Communication), C. Fellbaum, ed., MIT Press, May 1998.
[7] B.J. Frey and D. Dueck, "Clustering by Passing Messages between Data Points," Science, vol. 315, pp. 972-976, 2007.
[8] G. Geisler and G. Marchionini, "The Open Video Project: A Research-Oriented Digital Video Repository," Proc. ACM Digital Libraries, pp. 258-259, http://www.open-video.org, 2000.
[9] A. Hampapur, R. Jain, and T.E. Weymouth, "Production Model Based Digital Video Segmentation," Multimedia Tools Appl., vol. 1, no. 1, pp. 9-46, 1995.
[10] A. Joly, C. Frélicot, and O. Buisson, "Robust Content-Based Video Copy Identification in a Large Reference Database," Proc. Conf. Image and Video Retrieval, pp. 414-424, 2003.
[11] R. Junee, "Zoinks! 20 Hours of Video Uploaded Every Minute!" http://youtube-global.blogspot.com/2009/05/zoinks-20-hours-of-video-uploaded-every_20.html, May 2009.
[12] A. Karpenko and P. Aarabi, "Tiny Videos: Non-Parametric Content-Based Video Retrieval and Recognition," Proc. Tenth IEEE Int'l Symp. Multimedia, pp. 619-624, Dec. 2008.
[13] I. Laptev, "On Space-Time Interest Points," Int'l J. Computer Vision, vol. 64, nos. 2-3, pp. 107-123, 2005.
[14] J. Law-To, A. Joly, and N. Boujemaa, "Muscle-VCD-2007: A Live Benchmark for Video Copy Detection," http://www-rocq.inria.fr/imedia/civr-bench/, 2007.
[15] J. Law-To, L. Chen, A. Joly, I. Laptev, O. Buisson, V. Gouet-Brunet, N. Boujemaa, and F. Stentiford, "Video Copy Detection: A Comparative Study," Proc. Sixth ACM Int'l Conf. Image and Video Retrieval, pp. 371-378, 2007.
[16] D.G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," Int'l J. Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[17] S. Lu, M.R. Lyu, and I. King, "Semantic Video Summarization Using Mutual Reinforcement Principle and Shot Arrangement Patterns," Proc. 11th IEEE CS Int'l Multimedia Modelling Conf., pp. 60-67, 2005.
[18] D. Nistér and H. Stewénius, "Scalable Recognition with a Vocabulary Tree," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 2161-2168, 2006.
[19] K.A. Peker, A. Divakaran, and H. Sun, "Constant Pace Skimming and Temporal Sub-Sampling of Video Using Motion Activity," Proc. Int'l Conf. Image Processing, vol. 3, pp. 414-417, 2001.
[20] B. Shahraray, "Scene Change Detection and Content-Based Sampling of Video Sequences," Proc. SPIE Conf., pp. 2-13, Apr. 1995.
[21] A.F. Smeaton, P. Over, and W. Kraaij, "Evaluation Campaigns and TRECVid," Proc. Eighth ACM Int'l Workshop Multimedia Information Retrieval, pp. 321-330, 2006.
[22] N. Snavely, S.M. Seitz, and R. Szeliski, "Modeling the World from Internet Photo Collections," Int'l J. Computer Vision, vol. 80, no. 2, pp. 189-210, http://phototour.cs.washington.edu/, Nov. 2008.
[23] C. Toklu, S.P. Liou, and M. Das, "Videoabstract: A Hybrid Approach to Generate Semantically Meaningful Video Summaries," Proc. IEEE Int'l Conf. Multimedia and Expo, vol. 3, pp. 1333-1336, 2000.
[24] A. Torralba, R. Fergus, and W.T. Freeman, "80 Million Tiny Images: A Large Data Set for Non-Parametric Object and Scene Recognition," Technical Report MIT-CSAIL-TR-2007-024, 2007.
[25] A. Torralba, R. Fergus, and Y. Weiss, "Small Codes and Large Image Databases for Recognition," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2008.
[26] R. Zabih, J. Miller, and K. Mai, "A Feature-Based Algorithm for Detecting and Classifying Scene Breaks," Proc. ACM Multimedia Conf., pp. 189-200, 1995.
[27] R. Zabih, J. Miller, and K. Mai, "A Feature-Based Algorithm for Detecting and Classifying Production Effects," Multimedia Systems, vol. 7, no. 2, pp. 119-128, 1999.
