Issue No. 03 - March (2012 vol. 34)
Zhe Lin , Adv. Technol. Labs., Adobe Syst. Inc., San Jose, CA, USA
Zhuolin Jiang , Inst. for Adv. Comput. Studies, Univ. of Maryland, College Park, MD, USA
Larry S. Davis , Inst. for Adv. Comput. Studies, Univ. of Maryland, College Park, MD, USA
A shape-motion prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, an action prototype tree is learned in a joint shape and motion space via hierarchical K-means clustering and each training sequence is represented as a labeled prototype sequence; then a look-up table of prototype-to-prototype distances is generated. During testing, based on a joint probability model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint probability, which is efficiently performed by searching the learned prototype tree; then actions are recognized using dynamic prototype sequence matching. Distance measures used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. Our approach enables robust action matching in challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 92.86 percent on a large gesture data set (with dynamic backgrounds), 100 percent on the Weizmann action data set, 95.77 percent on the KTH action data set, 88 percent on the UCF sports data set, and 87.27 percent on the CMU action data set.
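The look-up-table idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes prototypes are plain feature vectors compared by Euclidean distance, that each sequence has already been mapped to a list of prototype indices, and that classic dynamic time warping stands in for the paper's dynamic prototype sequence matching. The names `build_distance_table`, `dtw_distance`, and `classify` are hypothetical.

```python
import math

def build_distance_table(prototypes):
    """Precompute all prototype-to-prototype distances once.

    Assumption: Euclidean distance between feature vectors; the
    paper's joint shape-motion distance is not reproduced here.
    """
    k = len(prototypes)
    table = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d = math.dist(prototypes[i], prototypes[j])
            table[i][j] = table[j][i] = d
    return table

def dtw_distance(seq_a, seq_b, table):
    """Dynamic time warping cost between two prototype-index sequences.

    Every local distance is a table look-up, so no frame-to-frame
    feature distance is computed at matching time.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = table[seq_a[i - 1]][seq_b[j - 1]]
            cost[i][j] = d + min(cost[i - 1][j],      # repeat seq_b[j-1]
                                 cost[i][j - 1],      # repeat seq_a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def classify(query, training, table):
    """Return the label of the nearest training sequence under DTW."""
    return min(training, key=lambda item: dtw_distance(query, item[0], table))[1]

# Toy data (hypothetical): three 2-D prototypes and two labeled sequences.
prototypes = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
table = build_distance_table(prototypes)
training = [([0, 0, 1, 2], "wave"), ([2, 2, 2], "jump")]
label = classify([0, 1, 2], training, table)
```

Because the table is built once during training, each cell of the warping matrix costs a single index operation, which is where the speed-up over recomputing frame-to-frame distances comes from.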
video signal processing, image matching, image recognition, image sequences, learning (artificial intelligence), pattern clustering, table lookup, learning, human action recognition, shape-motion prototype-based approach, flexible action matching, video sequences, joint shape, motion space, hierarchical k-means clustering, training sequence, prototype-to-prototype distances, joint probability model, actor location, action prototype, frame-to-prototype correspondence, dynamic prototype sequence matching, distance measures, look-up table indexing, brute-force computation, frame-to-frame distances, moving cameras, dynamic backgrounds, large gesture data set, Weizmann action data set, KTH action data set, UCF sports data set, CMU action data set, Prototypes, Shape, Feature extraction, Humans, Hidden Markov models, Joints, Training, dynamic time warping, Action recognition, shape-motion prototype tree, hierarchical K-means clustering, joint probability
Zhe Lin, Zhuolin Jiang and L. S. Davis, "Recognizing Human Actions by Learning and Matching Shape-Motion Prototype Trees," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 3, pp. 533-547, 2012.