Issue No. 02 - February (2010 vol. 32)
John R. Hershey, IBM T. J. Watson Research Center, Yorktown Heights
Tim K. Marks, Mitsubishi Electric Research Laboratories, Cambridge
Javier R. Movellan, University of California San Diego, La Jolla
We present a generative model and inference algorithm for 3D nonrigid object tracking. The model, which we call G-flow, enables the joint inference of 3D position, orientation, and nonrigid deformations, as well as object texture and background texture. Optimal inference under G-flow reduces to a conditionally Gaussian stochastic filtering problem. The optimal solution to this problem reveals a new space of computer vision algorithms, of which classic approaches such as optic flow and template matching are special cases that are optimal only under special circumstances. We evaluate G-flow on the problem of tracking facial expressions and head motion in 3D from single-camera video. Previously, the lack of realistic video data with ground truth nonrigid position information has hampered the rigorous evaluation of nonrigid tracking. We introduce a practical method of obtaining such ground truth data and present a new face video data set that was created using this technique. Results on this data set show that G-flow is much more robust and accurate than current deterministic optic-flow-based approaches.
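The "conditionally Gaussian" structure mentioned above can be illustrated with a toy example. The following is a minimal sketch, not the authors' algorithm: it assumes the object pose is already known, treats each texture pixel as an independent scalar Gaussian with random-walk dynamics, and applies one Kalman update per frame. The function name `update_texture` and all parameters are hypothetical, chosen for illustration only. Under these assumptions the two extremes of the Kalman gain behave like a fixed template (gain 0) and like a template replaced by the latest frame (gain near 1), which is one way to picture how template matching and optic flow could arise as limiting cases.

```python
def update_texture(mean, var, observed, process_var, obs_var):
    """One scalar Kalman step for a single texture pixel, given a known pose.

    All names are hypothetical; this is an illustrative sketch only.
    mean, var    : current Gaussian belief about the pixel's texture value
    observed     : pixel value seen in the new frame at the predicted location
    process_var  : assumed random-walk variance of the texture between frames
    obs_var      : assumed observation (image) noise variance, must be > 0
    """
    # Predict: the texture may drift, so uncertainty grows by process_var.
    prior_var = var + process_var
    # Kalman gain: weight given to the new frame relative to the template.
    gain = prior_var / (prior_var + obs_var)
    # Correct: move the template toward the observation by the gain.
    new_mean = mean + gain * (observed - mean)
    new_var = (1.0 - gain) * prior_var
    return new_mean, new_var, gain


# Fixed, fully trusted template: gain is 0 and the template never changes.
fixed = update_texture(0.2, 0.0, 0.8, process_var=0.0, obs_var=0.01)

# Very large texture drift: gain approaches 1 and the template is
# effectively replaced by the most recent frame.
replaced = update_texture(0.2, 0.0, 0.8, process_var=1e12, obs_var=0.01)

# Equal prior and observation variance: the update averages the two.
blended = update_texture(0.2, 1.0, 0.8, process_var=0.0, obs_var=1.0)
```

Intermediate gains blend the stored template with the new frame, which is the regime a fixed-template or frame-to-frame method cannot represent in this toy setting.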
Computer vision, generative models, motion, shape, texture, video analysis, face tracking.
John R. Hershey, Tim K. Marks, Javier R. Movellan, "Tracking Motion, Deformation, and Texture Using Conditionally Gaussian Processes", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 32, no. 2, pp. 348-363, February 2010, doi:10.1109/TPAMI.2008.278