2016 IEEE International Conference on Multimedia and Expo (ICME) (2016)
Seattle, WA, USA
July 11, 2016 to July 15, 2016
ISSN: 1945-788X
ISBN: 978-1-4673-7259-6
pp: 1-6
Zhenqi Xu , Beijing University of Posts and Telecommunications, No. 10, Xitu Cheng Road, Haidian District, Beijing, China, 100876
Jiani Hu , Beijing University of Posts and Telecommunications, No. 10, Xitu Cheng Road, Haidian District, Beijing, China, 100876
Weihong Deng , Beijing University of Posts and Telecommunications, No. 10, Xitu Cheng Road, Haidian District, Beijing, China, 100876
ABSTRACT
Video classification is more difficult than image classification because the motion between image frames and the large amount of redundancy in videos must also be taken into account. In this work, we propose a new deep learning architecture, the recurrent convolutional neural network (RCNN), which combines convolution operations and recurrent links for video classification tasks. Our architecture extracts local, dense features from individual frames while also learning the temporal features between consecutive frames. We also compare sequential sampling with random sampling when training our models, and find that random sampling is necessary for video classification. The feature maps from our learned model preserve motion across image frames, analogous to the persistence of vision in the human visual system. We achieve 81.0% classification accuracy without optical flow and 86.3% with optical flow on the UCF-101 dataset, both competitive with state-of-the-art methods.
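The core idea of the abstract, combining convolution with recurrent links so the hidden state carries motion information across frames, can be sketched as a toy update of the form h_t = relu(W_x * x_t + W_h * h_{t-1}), where "*" is a convolution. The code below is a hypothetical illustration, not the authors' implementation: it uses pure-Python 1-D convolutions over tiny one-row "frames", and the kernel weights are made-up values chosen only to show the mechanism.

```python
# Hypothetical sketch of a recurrent-convolutional update (not the paper's code):
# h_t = relu(conv(x_t, w_x) + conv(h_{t-1}, w_h))

def conv1d(signal, kernel):
    """'Same'-padded 1-D convolution, pure Python for illustration."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]

def relu(values):
    return [max(0.0, v) for v in values]

def rcnn_step(x_t, h_prev, w_x, w_h):
    """One recurrent-convolutional step over a 1-D feature row."""
    return relu([a + b for a, b in zip(conv1d(x_t, w_x),
                                       conv1d(h_prev, w_h))])

# Toy "video": a bright spot moving one position per frame.
frames = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0]]

h = [0.0, 0.0, 0.0, 0.0]  # initial hidden state
for frame in frames:
    # w_x spreads each frame spatially; w_h decays the previous state,
    # so traces of earlier spot positions persist in h (the "motion" cue).
    h = rcnn_step(frame, h, w_x=[0.5, 1.0, 0.5], w_h=[0.0, 0.5, 0.0])

print(h)
```

After the loop, the hidden state is non-zero not only at the spot's current position but also at its earlier positions, which is the sense in which such a model can "preserve motion" across frames.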
INDEX TERMS
Training, Computer architecture, Convolution, Machine learning, Optical imaging, Redundancy, Neural networks
CITATION

Z. Xu, J. Hu and W. Deng, "Recurrent convolutional neural network for video classification," 2016 IEEE International Conference on Multimedia and Expo (ICME), Seattle, WA, USA, 2016, pp. 1-6.
doi:10.1109/ICME.2016.7552971