Issue No.11 - Nov. (2012 vol.34)
pp: 2247-2258
Mohamed-Bécha Kaâniche, Higher Sch. of Commun. of Tunis (Sup'Com), Univ. of Carthage, El Ghazala, Tunisia
François Brémond, INRIA, Sophia Antipolis, France
ABSTRACT
We introduce a new gesture recognition framework based on learning the local motion signatures (LMSs) of HOG descriptors introduced in [1]. Our main contribution is a new probabilistic learning-classification scheme based on reliable tracking of local features. The LMSs are computed for one individual by tracking Histograms of Oriented Gradients (HOG) [2] descriptors. From these LMSs, we learn a codebook of video-words (i.e., clusters of LMSs) by running the k-means algorithm on a training database of gesture videos. The video-words are then compacted into a codebook of codewords by the Maximization of Mutual Information (MMI) algorithm. In the final step, we compare the LMSs generated for a new gesture against the learned codebook using the k-nearest neighbors (k-NN) algorithm and a novel voting strategy. In particular, the proposed voting strategy handles the N-to-N mapping between codewords and gesture labels. Experiments have been carried out on two public gesture databases: KTH [3] and IXMAS [4]. Results show that the proposed method outperforms recent state-of-the-art methods.
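The learn-then-vote pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Synthetic 4-D vectors stand in for LMSs, the MMI compaction step is omitted, and the vote is a simple inverse-distance weighting of per-codeword label distributions, chosen here only to show how one codeword can support several gesture labels. All function names (`kmeans`, `label_histograms`, `classify`) and parameter values are hypothetical.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = points[assign == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    # Final assignment against the final centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)

def label_histograms(assign, labels, k, n_labels):
    """Row j is the empirical distribution of gesture labels in cluster j."""
    counts = np.zeros((k, n_labels))
    np.add.at(counts, (assign, labels), 1.0)
    counts += 1e-9  # avoids division by zero for empty clusters
    return counts / counts.sum(axis=1, keepdims=True)

def classify(signatures, centroids, word_label_probs, n_neighbors=3):
    """Each signature votes through its nearest codewords; votes are summed."""
    votes = np.zeros(word_label_probs.shape[1])
    for s in signatures:
        d = np.linalg.norm(centroids - s, axis=1)
        nearest = np.argsort(d)[:n_neighbors]
        weights = 1.0 / (d[nearest] + 1e-9)  # closer codewords count more
        votes += (weights[:, None] * word_label_probs[nearest]).sum(axis=0)
    return int(votes.argmax())

# Assumption: two well-separated synthetic gesture classes in 4-D.
rng = np.random.default_rng(1)
train = np.vstack([rng.normal(0.0, 0.3, (60, 4)), rng.normal(3.0, 0.3, (60, 4))])
labels = np.array([0] * 60 + [1] * 60)

centroids, assign = kmeans(train, k=6)
probs = label_histograms(assign, labels, k=6, n_labels=2)

query = rng.normal(3.0, 0.3, (10, 4))  # signatures of an unseen class-1 gesture
predicted = classify(query, centroids, probs)
print(predicted)
```

Because each codeword carries a full distribution over labels rather than a single label, a codeword shared by several gestures contributes to all of them in proportion to how often it occurred in each during training.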
INDEX TERMS
Tracking, Equations, Vectors, Feature extraction, Kalman filters, Trajectory, Clustering algorithms, probabilistic learning and classification, Gesture recognition, motion detection, HOG descriptors, feature tracking
CITATION
Mohamed-Bécha Kaâniche, François Brémond, "Recognizing Gestures by Learning Local Motion Signatures of HOG Descriptors", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 11, pp. 2247-2258, Nov. 2012, doi:10.1109/TPAMI.2012.19
REFERENCES
[1] M.B. Kaâniche and F. Brémond, "Tracking HOG Descriptors for Gesture Recognition," Proc. IEEE Int'l Conf. Advanced Video and Signal Based Surveillance, pp. 140-145, 2009.
[2] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 886-893, June 2005.
[3] C. Schuldt, I. Laptev, and B. Caputo, "Recognizing Human Actions: A Local SVM Approach," Proc. IEEE Int'l Conf. Pattern Recognition, vol. 3, pp. 32-36, Aug. 2004.
[4] D. Weinland, E. Boyer, and R. Ronfard, "Action Recognition from Arbitrary Views Using 3D Exemplars," Proc. IEEE Int'l Conf. Computer Vision, pp. 1-7, Oct. 2007.
[5] R. Munoz-Salinas, R. Medina-Carnicer, F.J. Madrid-Cuevas, and A. Carmona-Poyato, "Depth Silhouettes for Gesture Recognition," Pattern Recognition Letters, vol. 29, no. 3, pp. 319-329, 2008.
[6] C.W. Chu and I. Cohen, "Posture and Gesture Recognition Using 3D Body Shapes Decomposition," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition: Workshops, vol. 3, p. 69, 2005.
[7] S. Calderara, R. Cucchiara, and A. Prati, "Action Signature: A Novel Holistic Representation for Action Recognition," Proc. IEEE Int'l Conf. Advanced Video and Signal Based Surveillance, pp. 121-128, 2008.
[8] J. Liu and M. Shah, "Learning Human Actions via Information Maximization," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, pp. 1-8, June 2008.
[9] M.B. Kaâniche and F. Brémond, "Gesture Recognition by Learning Local Motion Signatures," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2010.
[10] A. Yilmaz and M. Shah, "A Differential Geometric Approach to Representing the Human Actions," Computer Vision and Image Understanding, vol. 109, no. 3, pp. 335-351, 2008.
[11] W.L. Lu and J.J. Little, "Tracking and Recognizing Actions at a Distance," Proc. ECCV Workshop Computer Vision Based Analysis in Sport Environments, May 2006.
[12] W.L. Lu and J.J. Little, "Simultaneous Tracking and Action Recognition Using the PCA-HOG Descriptor," Proc. Third Canadian Conf. Computer and Robot Vision, pp. 6-13, June 2006.
[13] L. Gorelick, M. Blank, E. Shechtman, M. Irani, and R. Basri, "Actions as Space-Time Shapes," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 12, pp. 2247-2253, Dec. 2007.
[14] J.C. Niebles, H. Wang, and L. Fei-Fei, "Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words," Proc. 17th British Machine Vision Conf., vol. 3, pp. 1249-1258, 2006.
[15] P. Scovanner, S. Ali, and M. Shah, "A 3-Dimensional SIFT Descriptor and Its Application to Action Recognition," Proc. ACM Int'l Conf. Multimedia, pp. 357-360, 2008.
[16] Q. Luo, X. Kong, G. Zeng, and J. Fan, "Human Action Detection via Boosted Local Motion Histograms," Machine Vision and Applications, vol. 21, pp. 377-389, http://dx.doi.org/10.1007/s00138-008-0168-5, Nov. 2008.
[17] J. Sun, X. Wu, S. Yan, L.F. Cheong, T.S. Chua, and J. Li, "Hierarchical Spatio-Temporal Context Modeling for Action Recognition," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, pp. 2004-2011, 2009.
[18] T. Brox, C. Bregler, and J. Malik, "Large Displacement Optical Flow," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, pp. 41-48, 2009.
[19] K.K. Reddy, J. Liu, and M. Shah, "Incremental Action Recognition Using Feature-Tree," Proc. IEEE Int'l Conf. Computer Vision, 2009.
[20] R. Messing, C. Pal, and H. Kautz, "Activity Recognition Using the Velocity Histories of Tracked Keypoints," Proc. IEEE Int'l Conf. Computer Vision, 2009.
[21] P. Matikainen, M. Hebert, and R. Sukthankar, "Trajectons: Action Recognition through the Motion Analysis of Tracked Features," Proc. IEEE Int'l Conf. Computer Vision, 2009.
[22] M. Raptis and S. Soatto, "Tracklet Descriptors for Action Modeling and Video Analysis," Proc. European Conf. Computer Vision, pp. 577-590, 2010.
[23] A. Yao, J. Gall, and L. Van Gool, "A Hough Transform-Based Voting Framework for Action Recognition," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, 2010.
[24] D. McNeill, Hand and Mind: What Gestures Reveal about Thought. Univ. of Chicago Press, 1992.
[25] A.-T. Nghiem, F. Brémond, and M. Thonnat, "Controlling Background Subtraction Algorithms for Robust Object Detection," Proc. Third Int'l Conf. Imaging for Crime Detection and Prevention, 2009.
[26] J. Shi and C. Tomasi, "Good Features to Track," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, pp. 593-600, June 1994.
[27] E. Rosten and T. Drummond, "Machine Learning for High-Speed Corner Detection," Proc. European Conf. Computer Vision, vol. 1, pp. 430-443, May 2006.
[28] D. Kim, J. Song, and D. Kim, "Simultaneous Gesture Segmentation and Recognition Based on Forward Spotting Accumulative HMMs," Pattern Recognition, vol. 40, no. 11, pp. 3012-3026, 2007.
[29] T.K. Kim, S.F. Wong, and R. Cipolla, "Tensor Canonical Correlation Analysis for Action Classification," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, pp. 1-8, June 2007.
[30] F. Lv and R. Nevatia, "Single View Human Action Recognition Using Key Pose Matching and Viterbi Path Searching," Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, June 2007.
[31] B. Georgescu, I. Shimshoni, and P. Meer, "Mean Shift Based Clustering in High Dimensions: A Texture Classification Example," Proc. IEEE Int'l Conf. Computer Vision, vol. 1, pp. 456-463, 2003.