Issue No.11 - Nov. (2013 vol.35)
pp: 2720-2735
Yebin Liu , Autom. Dept., Tsinghua Univ., Beijing, China
J. Gall , Perceiving Syst. Dept., Max Planck Inst. for Intell. Syst., Tubingen, Germany
C. Stoll , Max Planck Inst. for Inf., Saarland Univ., Saarbrucken, Germany
Qionghai Dai , Autom. Dept., Tsinghua Univ., Beijing, China
Hans-Peter Seidel , Max Planck Inst. for Inf., Saarland Univ., Saarbrucken, Germany
C. Theobalt , Max Planck Inst. for Inf., Saarland Univ., Saarbrucken, Germany
ABSTRACT
Capturing the skeleton motion and detailed time-varying surface geometry of multiple, closely interacting people is a very challenging task, even in a multicamera setup, due to frequent occlusions and ambiguities in feature-to-person assignments. To address this task, we propose a framework that exploits multiview image segmentation. To this end, a probabilistic shape and appearance model is employed to segment the input images and to assign each pixel uniquely to one person. Given the articulated template models of each person and the labeled pixels, a combined optimization scheme, which splits the skeleton pose optimization problem into a local one and a lower dimensional global one, is applied one by one to each individual, followed by surface estimation to capture detailed nonrigid deformations. We show on various sequences that our approach can capture the 3D motion of humans accurately even if they move rapidly, if they wear wide apparel, and if they are engaged in challenging multiperson motions, including dancing, wrestling, and hugging.
INDEX TERMS
Optimization, Estimation, Image segmentation, Shape, Joints, Humans, Markerless motion capture, multiview video, multiple characters
CITATION
Yebin Liu, J. Gall, C. Stoll, Qionghai Dai, Hans-Peter Seidel, C. Theobalt, "Markerless Motion Capture of Multiple Characters Using Multiview Image Segmentation", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 11, pp. 2720-2735, Nov. 2013, doi:10.1109/TPAMI.2013.47