Issue No. 09 - September 2011 (vol. 33)
pp. 1793-1805
Martin de La Gorce, Laboratoire MAS, Ecole Centrale de Paris, Chatenay-Malabry
David J. Fleet, University of Toronto, Toronto
Nikos Paragios, Laboratoire MAS, Ecole Centrale de Paris, Chatenay-Malabry and INRIA Saclay - Ile-de-France, Orsay
A novel model-based approach to 3D hand tracking from monocular video is presented. The 3D hand pose, the hand texture, and the illuminant are dynamically estimated through minimization of an objective function. Derived from an inverse problem formulation, the objective function enables explicit use of temporal texture continuity and shading information while handling important self-occlusions and time-varying illumination. The minimization is done efficiently using a quasi-Newton method, for which we provide a rigorous derivation of the objective function gradient. Particular attention is given to terms related to the change of visibility near self-occlusion boundaries that are neglected in existing formulations. To this end, we introduce new occlusion forces and show that using all gradient terms greatly improves the performance of the method. Qualitative and quantitative experimental results demonstrate the potential of the approach.
Hand tracking, model-based shape from shading, generative modeling, pose estimation, variational formulation, gradient descent.
Martin de La Gorce, David J. Fleet, Nikos Paragios, "Model-Based 3D Hand Pose Estimation from Monocular Video", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.33, no. 9, pp. 1793-1805, September 2011, doi:10.1109/TPAMI.2011.33
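The abstract describes minimizing an image-based objective with a quasi-Newton method driven by an analytic gradient. The following is a minimal sketch of that general idea on a toy problem, not the paper's objective or implementation: the 1D Gaussian "image", the two parameters (center and amplitude), and all function names (render, objective_and_gradient, bfgs_minimize) are assumptions made purely for illustration. It ignores texture, shading, and the occlusion terms that the paper treats.

```python
import numpy as np

XS = np.linspace(-5.0, 5.0, 101)  # pixel coordinates of a toy 1D "image" (assumption)
SIGMA = 1.0                        # fixed blob width, assumed known


def render(theta):
    """Synthesize the toy image: a Gaussian blob with center c and amplitude a."""
    c, a = theta
    return a * np.exp(-(XS - c) ** 2 / (2.0 * SIGMA ** 2))


def objective_and_gradient(theta, observed):
    """Return 0.5 * ||render(theta) - observed||^2 and its analytic gradient."""
    c, a = theta
    blob = np.exp(-(XS - c) ** 2 / (2.0 * SIGMA ** 2))
    residual = a * blob - observed
    value = 0.5 * np.sum(residual ** 2)
    d_c = np.sum(residual * a * blob * (XS - c) / SIGMA ** 2)  # d/dc
    d_a = np.sum(residual * blob)                               # d/da
    return value, np.array([d_c, d_a])


def bfgs_minimize(fun, theta0, max_iter=200, tol=1e-8):
    """Plain BFGS with an Armijo backtracking line search (hypothetical helper)."""
    theta = np.asarray(theta0, dtype=float)
    eye = np.eye(theta.size)
    h_inv = eye.copy()  # running estimate of the inverse Hessian
    value, grad = fun(theta)
    for _ in range(max_iter):
        if np.linalg.norm(grad) < tol:
            break
        direction = -h_inv @ grad
        step = 1.0
        while True:
            candidate = theta + step * direction
            new_value, new_grad = fun(candidate)
            sufficient = new_value <= value + 1e-4 * step * (grad @ direction)
            if sufficient or step < 1e-12:
                break
            step *= 0.5
        s = candidate - theta
        y = new_grad - grad
        sy = s @ y
        if sy > 1e-12:  # update only when the curvature condition holds
            rho = 1.0 / sy
            left = eye - rho * np.outer(s, y)
            right = eye - rho * np.outer(y, s)
            h_inv = left @ h_inv @ right + rho * np.outer(s, s)
        theta, value, grad = candidate, new_value, new_grad
    return theta, value


# Usage: recover the parameters that generated a synthetic observation.
true_theta = np.array([1.0, 2.0])
observed = render(true_theta)
theta_hat, final_value = bfgs_minimize(
    lambda t: objective_and_gradient(t, observed), [0.0, 1.0]
)
print(theta_hat, final_value)
```

In practice a library routine such as scipy.optimize.minimize with method="BFGS" and jac=True could replace the hand-written loop. The explicit version is shown only to make the role of the analytic gradient visible.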