Human Activity Recognition Using Multidimensional Indexing
August 2002 (vol. 24 no. 8)
pp. 1091-1104

In this paper, we develop a novel method for view-based recognition of human actions and activities from video. By observing just a few frames, we can identify the activity that takes place in a video sequence. The basic idea of our method is that activities can be positively identified from a sparsely sampled sequence of a few body poses acquired from video. In our approach, an activity is represented by a set of pose and velocity vectors for the major body parts (hands, legs, and torso), which are stored in a set of multidimensional hash tables. We develop a theoretical foundation showing that robust recognition of a sequence of body pose vectors can be achieved by a method of indexing and sequencing, and that it requires only a few pose vectors (i.e., sampled body poses in video frames). We find that the probability of false alarm drops exponentially as the number of sampled body poses increases, so matching only a few body poses yields a high probability of correct recognition. Our approach is parallel: because all of the model activities are stored in the same set of hash tables, all possible model activities are examined in a single indexing operation. In addition, our method is robust to partial occlusion, since each body part is indexed separately. We use a sequence-based voting approach to make recognition invariant to the speed of the activity. Experiments on videos of eight different activities show robust recognition with our method. The method is also robust to variations in view angle within a range of ±30 degrees.
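The indexing-and-voting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bin size, the two-dimensional pose vectors, the function names (`quantize`, `build_tables`, `recognize`), and the toy model data are all assumptions made for illustration. The vote here simply counts matches per activity; the sequence-based voting of the paper, which also accounts for the temporal order of poses, is not reproduced.

```python
from collections import defaultdict

BIN_SIZE = 10.0  # hypothetical quantization step for every vector dimension


def quantize(vector, bin_size=BIN_SIZE):
    """Map a continuous pose/velocity vector to a discrete hash key."""
    return tuple(int(v // bin_size) for v in vector)


def build_tables(models):
    """Build one hash table per body part.

    models: {activity: [{part: vector, ...}, ...]}, one dict per model frame.
    Each table maps a quantized vector to the set of activities containing it,
    so all model activities share the same set of tables.
    """
    tables = defaultdict(lambda: defaultdict(set))
    for activity, frames in models.items():
        for frame in frames:
            for part, vector in frame.items():
                tables[part][quantize(vector)].add(activity)
    return tables


def recognize(tables, observed_frames):
    """Vote over a few sampled frames and return the best-supported activity.

    Each body part is looked up separately, so a part missing from a frame
    (e.g. occluded) simply casts no votes. Returns None if nothing matches.
    """
    votes = defaultdict(int)
    for frame in observed_frames:
        for part, vector in frame.items():
            # A single lookup retrieves every model activity for this pose.
            for activity in tables[part].get(quantize(vector), ()):
                votes[activity] += 1
    if not votes:
        return None
    return max(votes, key=votes.get)


# Toy model data (invented values, two dimensions per body part).
models = {
    "walk": [
        {"leg": (10.0, 35.0), "hand": (80.0, 5.0)},
        {"leg": (40.0, 20.0), "hand": (60.0, 15.0)},
    ],
    "sit": [
        {"leg": (90.0, 85.0), "hand": (10.0, 5.0)},
        {"leg": (95.0, 95.0), "hand": (12.0, 3.0)},
    ],
}
tables = build_tables(models)

# Two sampled frames; the hand is missing from the first one.
observed = [
    {"leg": (12.0, 38.0)},
    {"leg": (43.0, 24.0), "hand": (61.0, 17.0)},
]
print(recognize(tables, observed))  # prints "walk"
```

Keeping a separate table per body part is what lets the sketch tolerate a missing part: the first observed frame has no hand entry, yet the leg lookups alone still contribute votes.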

Index Terms:
Human activity recognition, multidimensional indexing, sequence recognition, human body part tracking, EXpansion Matching (EXM).
Citation:
Jezekiel Ben-Arie, Zhiqian Wang, Purvin Pandit, Shyamsundar Rajaram, "Human Activity Recognition Using Multidimensional Indexing," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 8, pp. 1091-1104, Aug. 2002, doi:10.1109/TPAMI.2002.1023805