Real-Time Gesture Recognition by Learning and Selective Control of Visual Interest Points
IEEE Transactions on Pattern Analysis and Machine Intelligence, March 2005 (vol. 27, no. 3), pp. 351-364
For the real-time recognition of unspecified gestures performed by an arbitrary person, we present a comprehensive framework that addresses two important problems in gesture recognition systems: selective attention and processing frame rate. To address the first problem, we propose the Quadruple Visual Interest Point Strategy. No assumptions are made about the scale or rotation of the visual features, which are computed from dynamically changing regions of interest in a given image sequence. Each visual feature is referred to as a visual interest point; a probability density function is assigned to each point, and selection is carried out on that basis. To address the second problem, we developed a selective control method that equips the recognition system with self-load monitoring and control. Evaluation experiments show that our approach provides robust recognition with respect to factors such as type of clothing, type of gesture, extent of motion trajectories, and individual differences in motion characteristics. To demonstrate the real-time performance and practical utility of our approach, we developed a gesture video system that supports full video-rate interaction with displayed image objects.
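The selective-control idea in the abstract can be illustrated with a minimal sketch: the recognizer monitors its own per-frame processing load and evaluates only as many visual interest points as the frame-time budget allows, so the system stays at video rate rather than falling behind. All names, costs, and thresholds below are illustrative assumptions, not the authors' implementation.

```python
import random
import time

# Assumed target: full video rate (30 fps), as claimed in the abstract.
TARGET_FRAME_TIME = 1.0 / 30.0

def process_interest_point(point):
    """Stand-in for computing one visual feature and its density-based score.

    A real system would extract the feature from the point's region of
    interest and evaluate it under the point's probability density function;
    here we simulate a fixed per-point cost and return a dummy score.
    """
    time.sleep(0.001)  # simulated per-point computation cost (~1 ms)
    return random.random()

def recognize_frame(points, budget=TARGET_FRAME_TIME):
    """Evaluate interest points in priority order until the budget is spent.

    Self-load monitoring: before each point, check the elapsed time for this
    frame; once the budget is exceeded, the remaining (lower-priority) points
    are skipped for this frame.
    """
    start = time.perf_counter()
    scores = []
    for p in points:
        if time.perf_counter() - start > budget:
            break  # selective control: shed load to hold the frame rate
        scores.append(process_interest_point(p))
    return scores

# With ~1 ms per point and a ~33 ms budget, only a subset of 100 candidate
# points is evaluated each frame, keeping total latency near video rate.
scores = recognize_frame(list(range(100)))
```

In the paper's terms, the priority ordering would come from the learned probability density functions, so the points most informative for discriminating gestures are evaluated first and load shedding sacrifices the least useful ones.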


Index Terms:
Gesture recognition, selective control, visual interest points, Gaussian density feature, real-time interaction.
Citation:
Toshiyuki Kirishima, Kosuke Sato, Kunihiro Chihara, "Real-Time Gesture Recognition by Learning and Selective Control of Visual Interest Points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 3, pp. 351-364, March 2005, doi:10.1109/TPAMI.2005.61