Issue No.04 - April (2011 vol.33)
pp: 780-793
Yaser Sheikh , Carnegie Mellon University, Pittsburgh
Ankur Datta , Carnegie Mellon University, Pittsburgh
ABSTRACT
In this paper, we describe the explicit application of articulation constraints for estimating the motion of a system of articulated planes. We relate articulations to the relative homography between planes and show that these articulations translate into linearized equality constraints on a linear least-squares system, which can be solved efficiently using a Karush-Kuhn-Tucker system. The articulation constraints can be applied to both gradient-based and feature-based motion estimation algorithms. To illustrate this, we describe a gradient-based motion estimation algorithm for an affine camera and a feature-based motion estimation algorithm for a projective camera, both of which explicitly enforce articulation constraints. We show that explicit application of articulation constraints leads to numerically stable estimates of motion. The simultaneous computation of motion estimates for all of the articulated planes in a scene allows us to handle scene areas where there is limited texture information and areas that leave the field of view. Our results demonstrate the wide applicability of the algorithm in a variety of challenging real-world cases such as human body tracking, motion estimation of rigid, piecewise planar scenes, and motion estimation of triangulated meshes.
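As a rough illustration of the kind of system the abstract refers to, the sketch below solves a generic equality-constrained linear least-squares problem, minimize ||Ax - b||^2 subject to Cx = d, through its Karush-Kuhn-Tucker system. It is not the formulation from the paper: the two-plane affine toy problem, the shared joint point, and all names (`solve_equality_constrained_lstsq`, `affine_rows`, `joint`) are assumptions made purely for illustration.

```python
import numpy as np

def solve_equality_constrained_lstsq(A, b, C, d):
    """Minimize ||A x - b||^2 subject to C x = d by solving the KKT system.

    Stationarity: 2 A^T A x + C^T lam = 2 A^T b
    Feasibility:  C x = d
    """
    n, m = A.shape[1], C.shape[0]
    K = np.block([[2.0 * A.T @ A, C.T],
                  [C, np.zeros((m, m))]])
    rhs = np.concatenate([2.0 * A.T @ b, d])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]  # motion parameters, Lagrange multipliers

def affine_rows(pt):
    """Rows mapping 6 affine parameters to the image position of pt."""
    u, v = pt
    return np.array([[u, v, 1.0, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, u, v, 1.0]])

# Hypothetical toy setup: two planar parts joined at a single point.
joint = np.array([1.0, 0.0])
pts1 = np.array([[-1.0, -1.0], [-1.0, 1.0], [0.0, 0.5], [0.5, -0.5]])
pts2 = np.array([[1.5, 0.5], [2.0, -0.5], [3.0, 0.0], [2.5, 1.0]])

# Ground truth: part 1 stays still, part 2 rotates about the joint.
theta = np.deg2rad(20.0)
c, s = np.cos(theta), np.sin(theta)
t = joint - np.array([[c, -s], [s, c]]) @ joint
x_true = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0,   # part 1 (identity)
                   c, -s, t[0], s, c, t[1]])       # part 2 (rotation)

# Stack the feature equations of both parts into one block-diagonal system.
A1 = np.vstack([affine_rows(p) for p in pts1])
A2 = np.vstack([affine_rows(p) for p in pts2])
A = np.block([[A1, np.zeros_like(A2)],
              [np.zeros_like(A1), A2]])
rng = np.random.default_rng(0)
b = A @ x_true + 0.01 * rng.standard_normal(A.shape[0])

# Articulation constraint: both motions map the joint to the same point.
C = np.hstack([affine_rows(joint), -affine_rows(joint)])
d = np.zeros(2)

x, lam = solve_equality_constrained_lstsq(A, b, C, d)
print("constraint residual:", np.abs(C @ x - d).max())
```

Because both parts are estimated in one system, the constraint rows couple their parameters, which is one way to see how a shared joint can support a part with few measurements of its own.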
INDEX TERMS
Registration, motion, tracking.
CITATION
Yaser Sheikh, Ankur Datta, "Linearized Motion Estimation for Articulated Planes", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 33, no. 4, pp. 780-793, April 2011, doi:10.1109/TPAMI.2010.134