Gaussian Process Dynamical Models for Human Motion
February 2008 (vol. 30 no. 2)
pp. 283-298
We introduce Gaussian process dynamical models (GPDM) for nonlinear time series analysis, with applications to learning models of human pose and motion from high-dimensional motion capture data. A GPDM is a latent variable model. It comprises a low-dimensional latent space with associated dynamics, and a map from the latent space to an observation space. We marginalize out the model parameters in closed form, using Gaussian process priors for both the dynamics and the observation mappings. This results in a non-parametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach, and compare four learning algorithms on human motion capture data in which each pose is 50-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces.
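
The two-part structure described above (latent dynamics plus a latent-to-observation map, each with its parameters marginalized under a Gaussian process prior) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a plain RBF kernel with fixed, hand-picked hyperparameters for both mappings, first-order dynamics, and toy synthetic data, and it omits the prior on the initial latent state, any per-dimension output scaling, and all learning of latent positions or hyperparameters. The names `rbf_kernel`, `gp_log_marginal`, and `gpdm_log_joint` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, variance=1.0, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_log_marginal(inputs, outputs, noise=1e-2):
    """Log density of `outputs` given `inputs` under independent zero-mean GPs
    (one per output column), i.e. with the mapping parameters integrated out."""
    n, d = outputs.shape
    K = rbf_kernel(inputs, inputs) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, outputs)            # L^{-1} Y, so sum(alpha^2) = tr(K^{-1} Y Y^T)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))     # log |K|
    return -0.5 * (d * n * np.log(2.0 * np.pi) + d * log_det + np.sum(alpha**2))

def gpdm_log_joint(X, Y):
    """Observation term log p(Y | X) plus first-order dynamics term
    log p(X[1:] | X[:-1]); the prior on X[0] is omitted in this sketch."""
    return gp_log_marginal(X, Y) + gp_log_marginal(X[:-1], X[1:])

# Toy data (an assumption for illustration): a 2-D latent loop mapped to 50-D "poses".
rng = np.random.default_rng(0)
theta = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta)])
Y = np.tanh(X @ rng.normal(size=(2, 50)))

# Shuffling frames keeps each (latent, pose) pair intact, so the observation
# term is unchanged; only the dynamics term sees the scrambled ordering.
perm = rng.permutation(len(X))
print("ordered :", gpdm_log_joint(X, Y))
print("shuffled:", gpdm_log_joint(X[perm], Y[perm]))
```

Because the kernel depends only on pairwise distances, permuting frames leaves the observation term unchanged, so any difference between the two printed scores comes from the dynamics term. On this toy loop the temporally ordered trajectory should score higher, which is the sense in which the dynamics prior favors smooth latent trajectories.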

Index Terms:
machine learning, motion, tracking, animation, stochastic processes, time series analysis
Citation:
Jack M. Wang, David J. Fleet, Aaron Hertzmann, "Gaussian Process Dynamical Models for Human Motion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 283-298, Feb. 2008, doi:10.1109/TPAMI.2007.1167