Recovering 3D Human Body Configurations Using Shape Contexts
July 2006 (vol. 28 no. 7)
pp. 1052-1062
The problem we consider in this paper is to take a single two-dimensional image containing a human figure, locate the joint positions, and use these to estimate the body configuration and pose in three-dimensional space. The basic approach is to store a number of exemplar 2D views of the human body in a variety of different configurations and viewpoints with respect to the camera. On each of these stored views, the locations of the body joints (left elbow, right knee, etc.) are manually marked and labeled for future use. The input image is then matched to each stored view, using the technique of shape context matching in conjunction with a kinematic chain-based deformation model. Assuming that there is a stored view sufficiently similar in configuration and pose, the correspondence process will succeed. The locations of the body joints are then transferred from the exemplar view to the test shape. Given the 2D joint locations, the 3D body configuration and pose are then estimated using an existing algorithm. We can apply this technique to video by treating each frame independently—tracking just becomes repeated recognition. We present results on a variety of data sets.
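The matching and joint-transfer steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each shape is already given as a list of sampled 2D edge points, replaces the matching with a greedy nearest-descriptor lookup, and omits the kinematic chain-based deformation model entirely. All function names, the bin counts, and the radius limits are hypothetical choices made for illustration.

```python
import math

def mean_pairwise_distance(points):
    """Average distance over all point pairs, used to normalize scale."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.hypot(points[i][0] - points[j][0],
                                points[i][1] - points[j][1])
    return total / (n * (n - 1) / 2)

def shape_context(points, index, scale, n_r=5, n_theta=12,
                  r_min=0.125, r_max=2.0):
    """Normalized log-polar histogram of the other points around points[index].

    Bin counts and radius limits are illustrative assumptions.
    """
    hist = [0.0] * (n_r * n_theta)
    px, py = points[index]
    log_lo, log_hi = math.log(r_min), math.log(r_max)
    for j, (x, y) in enumerate(points):
        if j == index:
            continue
        r = math.hypot(x - px, y - py) / scale
        if r == 0.0:
            continue  # skip coincident points
        r = min(max(r, r_min), r_max)  # clamp into the histogram range
        r_bin = min(int((math.log(r) - log_lo) / (log_hi - log_lo) * n_r),
                    n_r - 1)
        theta = math.atan2(y - py, x - px) % (2 * math.pi)
        t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
        hist[r_bin * n_theta + t_bin] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def chi2_cost(g, h):
    """Chi-squared distance between two normalized histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(g, h) if a + b > 0)

def match_points(exemplar, test):
    """For each exemplar point, return the index of the cheapest test point.

    Greedy simplification: a one-to-one assignment is not enforced.
    """
    s_e = mean_pairwise_distance(exemplar)
    s_t = mean_pairwise_distance(test)
    desc_e = [shape_context(exemplar, i, s_e) for i in range(len(exemplar))]
    desc_t = [shape_context(test, j, s_t) for j in range(len(test))]
    return [min(range(len(test)), key=lambda j: chi2_cost(d, desc_t[j]))
            for d in desc_e]

def transfer_joints(exemplar, test, joints):
    """Carry labeled exemplar joints over to the test shape.

    Each joint keeps its offset from the nearest exemplar sample point and
    is re-attached to that point's match (a hypothetical transfer rule).
    """
    match = match_points(exemplar, test)
    out = {}
    for name, (jx, jy) in joints.items():
        i = min(range(len(exemplar)),
                key=lambda k: math.hypot(exemplar[k][0] - jx,
                                         exemplar[k][1] - jy))
        qx, qy = test[match[i]]
        out[name] = (qx + (jx - exemplar[i][0]), qy + (jy - exemplar[i][1]))
    return out
```

Calling `transfer_joints(exemplar_points, test_points, {"left_elbow": (x, y)})` returns a dictionary of estimated joint positions on the test shape. Running the same call on every exemplar and keeping the one with the lowest total matching cost mirrors the exemplar search in the abstract, and repeating it per frame mirrors the frame-by-frame treatment of video.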

Index Terms:
Shape, object recognition, tracking, human body pose estimation.
Citation:
Greg Mori, Jitendra Malik, "Recovering 3D Human Body Configurations Using Shape Contexts," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 7, pp. 1052-1062, July 2006, doi:10.1109/TPAMI.2006.149