Issue No. 07 - July (2006 vol. 28)
Jitendra Malik, IEEE
The problem we consider in this paper is to take a single two-dimensional image containing a human figure, locate the joint positions, and use these to estimate the body configuration and pose in three-dimensional space. The basic approach is to store a number of exemplar 2D views of the human body in a variety of different configurations and viewpoints with respect to the camera. On each of these stored views, the locations of the body joints (left elbow, right knee, etc.) are manually marked and labeled for future use. The input image is then matched to each stored view, using the technique of shape context matching in conjunction with a kinematic chain-based deformation model. Assuming that there is a stored view sufficiently similar in configuration and pose, the correspondence process will succeed. The locations of the body joints are then transferred from the exemplar view to the test shape. Given the 2D joint locations, the 3D body configuration and pose are then estimated using an existing algorithm. We can apply this technique to video by treating each frame independently—tracking just becomes repeated recognition. We present results on a variety of data sets.
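The matching and joint-transfer steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each shape is already reduced to a short list of sampled 2D edge points, correspondences are chosen by nearest descriptor rather than by the paper's matching with a kinematic chain-based deformation model, and joints are transferred by a simple local offset. All function names, bin counts, and radii below are hypothetical choices for illustration.

```python
import math


def shape_contexts(points, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Return one normalised log-polar histogram per point (needs >= 2 distinct points)."""
    n = len(points)
    # Dividing radii by the mean pairwise distance makes the descriptor scale invariant.
    pairs = [(a, b) for a in range(n) for b in range(a + 1, n)]
    mean_dist = sum(math.dist(points[a], points[b]) for a, b in pairs) / len(pairs)
    log_span = math.log(r_outer / r_inner)
    histograms = []
    for i, (px, py) in enumerate(points):
        hist = [0.0] * (n_r * n_theta)
        for j, (qx, qy) in enumerate(points):
            if i == j:
                continue
            dx, dy = qx - px, qy - py
            r = max(math.hypot(dx, dy) / mean_dist, 1e-9)
            # Radii outside [r_inner, r_outer] are clamped into the first or last ring.
            r_bin = int(n_r * math.log(r / r_inner) / log_span)
            r_bin = min(max(r_bin, 0), n_r - 1)
            theta = math.atan2(dy, dx) % (2 * math.pi)
            t_bin = int(theta / (2 * math.pi) * n_theta) % n_theta
            hist[r_bin * n_theta + t_bin] += 1.0
        total = sum(hist)
        histograms.append([h / total for h in hist])
    return histograms


def chi2_cost(g, h):
    """Chi-squared distance between two normalised histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(g, h) if a + b > 0)


def match_exemplar(exemplar_pts, test_pts):
    """For each exemplar point, pick the test point with the cheapest descriptor.

    Assumption: greedy nearest-descriptor matching stands in for the paper's
    deformable matching. Returns (matches, mean matching cost).
    """
    ex_sc = shape_contexts(exemplar_pts)
    te_sc = shape_contexts(test_pts)
    matches, total = [], 0.0
    for g in ex_sc:
        costs = [chi2_cost(g, h) for h in te_sc]
        j = min(range(len(costs)), key=costs.__getitem__)
        matches.append(j)
        total += costs[j]
    return matches, total / len(ex_sc)


def transfer_joints(exemplar_pts, joints, test_pts, matches):
    """Move each labelled joint by the offset of its nearest matched edge point."""
    transferred = {}
    for name, (jx, jy) in joints.items():
        i = min(range(len(exemplar_pts)),
                key=lambda k: math.dist(exemplar_pts[k], (jx, jy)))
        ex, ey = exemplar_pts[i]
        tx, ty = test_pts[matches[i]]
        transferred[name] = (jx + tx - ex, jy + ty - ey)
    return transferred


def locate_joints(exemplars, test_pts):
    """exemplars: list of (name, points, joints). Returns (cost, name, joints) of the best match."""
    best = None
    for name, pts, joints in exemplars:
        matches, cost = match_exemplar(pts, test_pts)
        if best is None or cost < best[0]:
            best = (cost, name, transfer_joints(pts, joints, test_pts, matches))
    return best
```

Calling `locate_joints` on each video frame independently mirrors the "tracking as repeated recognition" idea. The resulting 2D joint locations would then be passed to a separate 3D estimation step, which this sketch does not attempt.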
Shape, object recognition, tracking, human body pose estimation.
Jitendra Malik, Greg Mori, "Recovering 3D Human Body Configurations Using Shape Contexts", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 28, no. 7, pp. 1052-1062, July 2006, doi:10.1109/TPAMI.2006.149