From Canonical Poses to 3D Motion Capture Using a Single Camera
July 2010 (vol. 32 no. 7)
pp. 1165-1181
Andrea Fossati, Ecole Polytechnique Fédérale de Lausanne (EPFL/IC/ISIM/CVLab), Lausanne
Miodrag Dimitrijevic, Ecole Polytechnique Fédérale de Lausanne (EPFL/IC/ISIM/CVLab), Lausanne
Vincent Lepetit, Ecole Polytechnique Fédérale de Lausanne (EPFL/IC/ISIM/CVLab), Lausanne
Pascal Fua, Ecole Polytechnique Fédérale de Lausanne (EPFL/IC/ISIM/CVLab), Lausanne
We combine detection and tracking techniques to achieve robust 3D motion recovery of people seen from arbitrary viewpoints by a single, potentially moving camera. We rely on detecting key postures, which can be done reliably, then use a motion model to infer 3D poses between consecutive detections, and finally refine them over the whole sequence using a generative model. We demonstrate our approach on golf motions filmed with a static camera and on walking motions acquired with a potentially moving one. We show that our approach, although monocular, is both metrically accurate, because it integrates information over many frames, and robust, because it can recover from a few misdetections.
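
As a rough illustration of the middle step of this pipeline (inferring poses between consecutive key-posture detections), the following is a minimal sketch only. It assumes a pose is a plain vector of joint angles and substitutes simple linear interpolation for the motion model; the abstract does not specify the model, and the function name, the `Pose` type, and the data layout are hypothetical. The detection step and the final generative refinement are not shown.

```python
from typing import List, Sequence, Tuple

# ASSUMPTION: a pose is a flat vector of joint angles (hypothetical layout).
Pose = List[float]


def interpolate_between_detections(
    detections: Sequence[Tuple[int, Pose]], n_frames: int
) -> List[Pose]:
    """Return one pose per frame from sparse (frame_index, pose) detections.

    ASSUMPTION: linear interpolation stands in for the motion model.
    Frames before the first or after the last detection copy the
    nearest detected pose.
    """
    if not detections:
        raise ValueError("need at least one key-pose detection")
    keys = sorted(detections, key=lambda d: d[0])
    poses: List[Pose] = []
    for t in range(n_frames):
        if t <= keys[0][0]:
            poses.append(list(keys[0][1]))
            continue
        if t >= keys[-1][0]:
            poses.append(list(keys[-1][1]))
            continue
        # Find the pair of consecutive detections that brackets frame t.
        for (t0, p0), (t1, p1) in zip(keys, keys[1:]):
            if t0 <= t <= t1:
                w = (t - t0) / (t1 - t0)
                poses.append([(1 - w) * a + w * b for a, b in zip(p0, p1)])
                break
    return poses


if __name__ == "__main__":
    # Two hypothetical detections, at frames 0 and 4, of a 2-angle pose.
    key_poses = [(0, [0.0, 10.0]), (4, [4.0, 2.0])]
    for frame, pose in enumerate(interpolate_between_detections(key_poses, 6)):
        print(frame, pose)
```

In a fuller system, the interpolated poses would serve only as an initialization that a later refinement over the whole sequence adjusts against the image data.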

Index Terms:
Computer vision, motion, video analysis, 3D scene analysis, modeling and recovery of physical attributes, tracking.
Citation:
Andrea Fossati, Miodrag Dimitrijevic, Vincent Lepetit, Pascal Fua, "From Canonical Poses to 3D Motion Capture Using a Single Camera," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 7, pp. 1165-1181, July 2010, doi:10.1109/TPAMI.2009.108