The Recognition of Human Movement Using Temporal Templates
March 2001 (vol. 23 no. 3)
pp. 257-267

Abstract: A new view-based approach to the representation and recognition of human movement is presented. The basis of the representation is a temporal template: a static vector-image where the vector value at each point is a function of the motion properties at the corresponding spatial location in an image sequence. Using aerobics exercises as a test domain, we explore the representational power of a simple, two-component version of the templates: the first value is a binary value indicating the presence of motion, and the second value is a function of the recency of motion in a sequence. We then develop a recognition method that matches temporal templates against stored instances of views of known actions. The method automatically performs temporal segmentation, is invariant to linear changes in speed, and runs in real time on standard platforms.
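
The two-component template described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how motion is detected or how recency is encoded, so the thresholded frame difference, the linear decay by one step per frame, and the names `motion_mask`, `update_recency`, `temporal_template`, `tau`, and `threshold` are all assumptions made for illustration.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale image as rows of pixel intensities


def motion_mask(prev: Frame, curr: Frame, threshold: int = 10) -> Frame:
    """Assumed motion detector: 1 where intensity changed by more than `threshold`."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def update_recency(recency: Frame, mask: Frame, tau: int) -> Frame:
    """Set moving pixels to `tau`; decay all other pixels by one step, floored at 0."""
    return [
        [tau if m else max(0, r - 1) for r, m in zip(r_row, m_row)]
        for r_row, m_row in zip(recency, mask)
    ]


def temporal_template(
    frames: List[Frame], tau: int, threshold: int = 10
) -> Tuple[Frame, Frame]:
    """Return (presence, recency) images for a sequence of equally sized frames."""
    height, width = len(frames[0]), len(frames[0][0])
    recency = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        recency = update_recency(recency, motion_mask(prev, curr, threshold), tau)
    # Presence is binary: any pixel that moved within the last `tau` steps.
    presence = [[1 if r > 0 else 0 for r in row] for row in recency]
    return presence, recency


# A bright dot moving left to right across a one-row image.
frames = [
    [[255, 0, 0, 0]],
    [[0, 255, 0, 0]],
    [[0, 0, 255, 0]],
    [[0, 0, 0, 255]],
]
presence, recency = temporal_template(frames, tau=3)
print(presence)  # [[1, 1, 1, 1]]
print(recency)   # [[1, 2, 3, 3]]
```

In this sketch the recency image is brighter where motion occurred more recently, so the direction of movement is readable from a single static image, while the presence image records only where motion occurred. The window `tau` controls how far back in time the template reaches.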

Index Terms:
Motion recognition, computer vision.
Aaron F. Bobick, James W. Davis, "The Recognition of Human Movement Using Temporal Templates," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 3, pp. 257-267, March 2001, doi:10.1109/34.910878