2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
Anchorage, AK, USA
June 23, 2008 to June 28, 2008
Grant Schindler, Georgia Institute of Technology, USA
Matthew Brown, University of British Columbia, Canada
Larry Zitnick, Microsoft Research, USA
In this paper, we examine the problem of internet video categorization. Specifically, we explore the representation of a video as a “bag of words” using various combinations of spatial and temporal descriptors. The descriptors incorporate both spatial and temporal gradients as well as optical flow information. We achieve state-of-the-art results on a standard human activity recognition database and demonstrate promising category recognition performance on two new databases of approximately 1000 and 1500 online user-submitted videos, which we will be making available to the community.
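As a rough illustration of the "bag of words" representation mentioned in the abstract, the sketch below quantizes a set of local descriptors against a fixed codebook and returns a normalized histogram of codeword counts. It is only a minimal sketch under stated assumptions: the abstract does not say how the codebook is built, which distance is used, or how the histogram is normalized, so the nearest-neighbor assignment with squared Euclidean distance, the function names `squared_distance` and `bag_of_words`, and the toy 2-D data are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bag_of_words(
    descriptors: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> List[float]:
    """Return a normalized histogram of nearest-codeword assignments.

    Assumption: each descriptor votes for exactly one codeword (hard
    assignment). The abstract does not specify the assignment scheme.
    """
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        # Index of the codeword closest to this descriptor.
        nearest = min(
            range(len(codebook)),
            key=lambda k: squared_distance(descriptor, codebook[k]),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        # No descriptors: return an all-zero histogram.
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Toy data (hypothetical): three 2-D codewords and four descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]]
histogram = bag_of_words(descriptors, codebook)
print(histogram)
```

In a video setting, the descriptors would be computed from spatial and temporal gradients or optical flow around sampled points, and the resulting histogram would serve as the fixed-length input to a classifier.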
Grant Schindler, Matthew Brown, Larry Zitnick, "Internet video category recognition", 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 1-7, 2008, doi:10.1109/CVPRW.2008.4562960