2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2010)
San Francisco, CA, USA
June 13, 2010 to June 18, 2010
Graham W. Taylor, New York University, New York, USA
Leonid Sigal, Disney Research Pittsburgh, USA
David J. Fleet, University of Toronto, Toronto, Canada
Geoffrey E. Hinton, University of Toronto, Toronto, Canada
We introduce a new class of probabilistic latent variable model, the Implicit Mixture of Conditional Restricted Boltzmann Machines (imCRBM), for use in human pose tracking. Key properties of the imCRBM are as follows: (1) learning is linear in the number of training exemplars, so it can be learned from large datasets; (2) it learns coherent models of multiple activities; (3) it automatically discovers atomic “movemes”; and (4) it can infer transitions between activities, even when such transitions are not present in the training set. We describe the model and how it is learned, and we demonstrate its use in the context of Bayesian filtering for multi-view and monocular pose tracking. The model handles difficult scenarios, including multiple activities and transitions among activities. We report state-of-the-art results on the HumanEva dataset.
G. W. Taylor, L. Sigal, D. J. Fleet and G. E. Hinton, "Dynamical binary latent variable models for 3D human pose tracking," 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 2010, pp. 631-638.