Bibliographic References
A State-Based Approach to the Representation and Recognition of Gesture
December 1997 (vol. 19 no. 12)
pp. 1325-1337

Abstract—A state-based technique for the representation and recognition of gesture is presented. We define a gesture to be a sequence of states in a measurement or configuration space. For a given gesture, these states are used to capture both the repeatability and variability evidenced in a training set of example trajectories. Using techniques for computing a prototype trajectory of an ensemble of trajectories, we develop methods for defining configuration states along the prototype and for recognizing gestures from an unsegmented, continuous stream of sensor data. The approach is illustrated by application to a range of gesture-related sensory data: the two-dimensional movements of a mouse input device, the movement of the hand measured by a magnetic spatial position and orientation sensor, and, lastly, the changing eigenvector projection coefficients computed from an image sequence.
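The abstract outlines a pipeline: compute a prototype trajectory from an ensemble of training examples, define configuration states along that prototype, and recognize a gesture when a continuous sensor stream passes through the states in order. Below is a minimal sketch of that flavor of approach, not the authors' algorithm: it assumes the prototype is a pointwise mean after arc-length resampling, models each state's variability as an isotropic spread around its mean, and treats recognition as visiting all states in sequence. The function names, the 3-sigma membership test, and the fixed number of states are all illustrative assumptions.

```python
import numpy as np

def resample(traj, n):
    """Resample a trajectory to n points spaced uniformly by arc length,
    so examples of different lengths and speeds can be averaged pointwise."""
    traj = np.asarray(traj, float)
    d = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(traj, axis=0), axis=1))]
    s = np.linspace(0.0, d[-1], n)
    return np.column_stack([np.interp(s, d, traj[:, k]) for k in range(traj.shape[1])])

def train_states(examples, n_states=5, samples_per_state=10):
    """Prototype = pointwise mean of arc-length-resampled examples.
    Each state stores the mean of its segment's samples and an isotropic
    spread (mean distance of member samples to that mean)."""
    n = n_states * samples_per_state
    stack = np.stack([resample(e, n) for e in examples])   # (examples, n, dim)
    states = []
    for i in range(n_states):
        seg = stack[:, i * samples_per_state:(i + 1) * samples_per_state, :]
        seg = seg.reshape(-1, stack.shape[2])
        mu = seg.mean(axis=0)
        sigma = np.linalg.norm(seg - mu, axis=1).mean() + 1e-6
        states.append((mu, sigma))
    return states

def recognize(stream, states, thresh=3.0):
    """True if the unsegmented stream passes through every state in order,
    where 'in state' means within `thresh` spreads of the state mean."""
    idx = 0
    for x in np.asarray(stream, float):
        mu, sigma = states[idx]
        if np.linalg.norm(x - mu) / sigma < thresh:
            idx += 1
            if idx == len(states):
                return True
    return False
```

For example, training on several straight-line mouse strokes of different sampling densities yields five states strung along the line; a new stroke along the same path traverses them in order and is accepted, while a trajectory far from the prototype never enters the first state. The key design point mirrored from the abstract is that the states, not a fixed time axis, carry both the repeatability (state means) and the variability (state spreads) of the training set.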

[1] A. Bobick and A. Wilson, "A State-Based Technique for the Summarization and Recognition of Gesture," Proc. Fifth Int'l Conf. Computer Vision, pp. 382-388, 1995.
[2] C. Bregler and S.M. Omohundro, "Nonlinear Image Interpolation Using Surface Learning," G. Tesauro, J.D. Cowan, and J. Alspector, eds., Advances in Neural Information Processing Systems, vol. 6, pp. 43-50. San Francisco: Morgan Kaufmann Publishers, 1994.
[3] E. Catmull and R. Rom, "A Class of Local Interpolating Splines," R. Barnhill and R. Riesenfeld, eds., Computer Aided Geometric Design, pp. 317-326. San Francisco: Academic Press, 1974.
[4] C. Cedras and M. Shah, "Motion-Based Recognition: A Survey," Image and Vision Computing, vol. 13, no. 2, pp. 129-155, Mar. 1995.
[5] Y. Cui and J. Weng, "Learning-Based Hand Sign Recognition," Proc. Int'l Workshop Automatic Face and Gesture Recognition, Zurich, 1995.
[6] T. Darrell and A. Pentland, "Space-Time Gestures," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 335-340, 1993.
[7] J.W. Davis and M. Shah, "Gesture Recognition," Proc. European Conf. Comp. Vis., pp. 331-340, 1994.
[8] K. Gould and M. Shah, "The Trajectory Primal Sketch: A Multi-Scale Scheme for Representing Motion Characteristics," Proc. IEEE Conf. Comp. Vision and Pattern Recognition, pp. 79-85, June 1989.
[9] T. Hastie and W. Stuetzle, "Principal Curves," J. Amer. Statistical Assoc., vol. 84, no. 406, pp. 502-516, 1989.
[10] G. Johansson, "Visual Perception of Biological Motion and a Model for Its Analysis," Perception and Psychophysics, vol. 14, no. 2, pp. 201-211, 1973.
[11] A. Kendon, "How Gestures Can Become Like Words," F. Poyatos, ed., Cross-Cultural Perspectives in Nonverbal Communication. New York: C.J. Hogrefe, 1988.
[12] J.S. Lipscomb, "A Trainable Gesture Recognizer," Pattern Recognition, vol. 24, no. 9, pp. 895-907, 1991.
[13] K.V. Mardia, N.M. Ghali, M. Howes, T.J. Hainsworth, and N. Sheehy, "Techniques for Online Gesture Recognition on Workstations," Image and Vision Computing, vol. 11, no. 5, pp. 283-294, 1993.
[14] H. Murase and S.K. Nayar, "Learning and Recognition of 3D Objects from Appearance," Proc. IEEE Qualitative Vision Workshop, New York, pp. 39-49, 1993.
[15] H. Murase and S. Nayar, "Visual Learning and Recognition of 3D Objects From Appearance," Int'l J. Comp. Vis., vol. 14, pp. 5-24, 1995.
[17] L.R. Rabiner and B.H. Juang, Fundamentals of Speech Recognition. Englewood Cliffs, N.J.: Prentice Hall, 1993.
[18] K. Rangarajan, W. Allen, and M. Shah, "Matching Motion Trajectories Using Scale-Space," Pattern Recognition, vol. 26, no. 4, pp. 595-610, 1993.
[19] K. Rohr, "Towards Model-Based Recognition of Human Movements in Image Sequences," Comp. Vis., Graph., and Img. Proc., vol. 59, no. 1, pp. 94-115, 1994.
[20] J. Schlenzig, E. Hunter, and R. Jain, "Recursive Identification of Gesture Inputs Using Hidden Markov Models," Proc. Second IEEE Workshop Applications of Computer Vision, Sarasota, Fla., pp. 187-194, Dec. 5-7, 1994.
[21] J. Schlenzig, E. Hunter, and R. Jain, "Vision Based Hand Gesture Interpretation Using Recursive Estimation," Proc. 28th Asilomar Conf. Signals, Systems, and Computers, 1994.
[22] G. Sperling, M. Landy, Y. Cohen, and M. Pavel, "Intelligible Encoding of ASL Image Sequences at Extremely Low Information Rates," Comp. Vis., Graph., and Img. Proc., vol. 31, pp. 335-391, 1985.
[23] T.E. Starner and A. Pentland, "Visual Recognition of American Sign Language Using Hidden Markov Models," Proc. Int'l Workshop Automatic Face and Gesture Recognition, Zurich, 1995.
[24] A.I. Tew and C.J. Gray, "A Real-Time Gesture Recognizer Based on Dynamic Programming," J. Biomedical Eng., vol. 15, pp. 181-187, May 1993.
[25] M. Turk and A. Pentland, "Eigenfaces for Recognition," J. Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[26] A.D. Wilson and A.F. Bobick, "Learning Visual Behavior for Gesture Analysis," Proc. IEEE Int'l Symp. Computer Vision, Coral Gables, Fla., Nov. 1995.
[27] A.D. Wilson, A.F. Bobick, and J. Cassell, "Temporal Classification of Natural Gesture and Application to Video Coding," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 948-954, 1997.
[28] J. Yamato, J. Ohya, and K. Ishii, "Recognizing Human Action in Time-Sequential Images Using Hidden Markov Model," Proc. 1992 IEEE Conf. Computer Vision and Pattern Recognition, pp. 379-385, 1992.

Index Terms:
Gesture recognition, state-based representation, gesture prototype, motion-based recognition.
Aaron F. Bobick and Andrew D. Wilson, "A State-Based Approach to the Representation and Recognition of Gesture," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 12, pp. 1325-1337, Dec. 1997, doi:10.1109/34.643892