Issue No. 05 - May (2008 vol. 30)
pp: 878-892
ABSTRACT
This paper describes methods for recovering the time-varying shape and motion of non-rigid 3D objects from uncalibrated 2D point tracks. For example, given a video recording of a talking person, we would like to estimate the 3D shape of the face at each instant and learn a model of facial deformation. Time-varying shape is modeled as a rigid transformation combined with a non-rigid deformation. Reconstruction is ill-posed if arbitrary deformations are allowed, and thus additional assumptions about deformations are required. We first suggest restricting shapes to lie within a low-dimensional subspace, and describe estimation algorithms. However, this restriction alone is insufficient to constrain reconstruction. To address this problem, we propose a reconstruction method using a Probabilistic Principal Components Analysis (PPCA) shape model, and an estimation algorithm that simultaneously estimates 3D shape and motion for each instant, learns the PPCA model parameters, and robustly fills in missing data points. We then extend the model to capture temporal dynamics in object shape, allowing the algorithm to robustly handle severe cases of missing data.
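As an illustration of the shape model summarized in the abstract, the following minimal sketch generates 2D point tracks from a deforming 3D shape. It is not the authors' implementation. The function names (`deform`, `project`, `sample_tracks`), the single-axis rotation, the scaled orthographic camera, and the unit-Gaussian prior on the deformation weights are all assumptions chosen to keep the example small.

```python
import math
import random

def deform(mean_shape, basis, z):
    """Return mean_shape + sum_k z[k] * basis[k], point by point.

    mean_shape: list of J (x, y, z) tuples.
    basis:      list of K deformation modes, each a list of J (x, y, z) tuples.
    z:          list of K deformation weights (the low-dimensional coordinates).
    """
    shape = []
    for j, point in enumerate(mean_shape):
        shape.append(tuple(
            point[d] + sum(z[k] * basis[k][j][d] for k in range(len(basis)))
            for d in range(3)
        ))
    return shape

def project(shape, angle, scale=1.0, translation=(0.0, 0.0)):
    """Rotate about the y-axis, then drop depth (assumed scaled orthographic camera)."""
    c, s = math.cos(angle), math.sin(angle)
    points_2d = []
    for x, y, depth in shape:
        u = scale * (c * x + s * depth) + translation[0]
        v = scale * y + translation[1]
        points_2d.append((u, v))
    return points_2d

def sample_tracks(mean_shape, basis, angles, noise_std=0.0, seed=0):
    """Draw one 2D observation per frame: Gaussian weights, rigid motion, pixel noise."""
    rng = random.Random(seed)
    tracks = []
    for angle in angles:
        # Assumed PPCA-style prior: deformation weights are standard normal.
        z = [rng.gauss(0.0, 1.0) for _ in basis]
        points = project(deform(mean_shape, basis, z), angle)
        tracks.append([
            (u + rng.gauss(0.0, noise_std), v + rng.gauss(0.0, noise_std))
            for u, v in points
        ])
    return tracks
```

The reconstruction problem described in the abstract runs in the opposite direction: given only the 2D tracks, estimate the per-frame shape, the rigid motion, and the deformation basis. The prior on the weights is what keeps that inverse problem constrained.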
INDEX TERMS
Motion, Shape, Machine learning, 3D/stereo scene analysis
CITATION
Lorenzo Torresani, Aaron Hertzmann, Chris Bregler, "Nonrigid Structure-from-Motion: Estimating Shape and Motion with Hierarchical Priors", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 30, no. 5, pp. 878-892, May 2008, doi:10.1109/TPAMI.2007.70752