Issue No.09 - September (2011 vol.33)
pp: 1844-1859
Eng-Jon Ong, University of Surrey, Guildford
Richard Bowden, University of Surrey, Guildford
ABSTRACT
This paper proposes a learned, data-driven approach for accurate, real-time tracking of facial features using only intensity information. Automatic facial feature tracking is nontrivial because the face is a highly deformable object with large textural variations and motion in certain regions. Existing works address these problems either by limiting themselves to tracking feature points with strong and unique visual cues (e.g., mouth and eye corners) or by incorporating a priori information that must be manually designed (e.g., selecting points for a shape model). The framework proposed here largely avoids such restrictions by automatically identifying the optimal visual support required for tracking a single facial feature point. This automatic identification of the visual context allows the proposed method to potentially track any point on the face. Tracking is achieved via linear predictors (LPs), which provide a fast and effective method for mapping pixel intensities into displacements of the tracked feature position. Building upon the simplicity and strengths of linear predictors, a more robust biased linear predictor is introduced. Multiple linear predictors are then grouped into a rigid flock to further increase robustness. To improve tracking accuracy, a novel probabilistic selection method is used to identify relevant visual areas for tracking a feature point. These selected flocks are then combined into a hierarchical multiresolution LP model. Finally, a simple shape constraint is exploited to correct the occasional tracking failure of a minority of feature points. Experimental results show that this method performs more robustly and accurately than active appearance models (AAMs), using minimal training examples, on sequences that range from SD quality to YouTube quality. An analysis of the consistency of the visual support across different subjects is also provided.
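The abstract's core idea, a linear predictor that maps pixel intensities to feature displacements, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic image, the function names (`make_image`, `sample`, `train_linear_predictor`, `predict`), the random choice of support pixels, the least-squares training on synthetic integer shifts, and the sign convention (the predictor returns the correction that moves the current position back onto the feature). It does not cover the biased predictor, flocks, probabilistic selection, multiresolution hierarchy, or shape constraint.

```python
import numpy as np

def make_image(size=64):
    """Smooth synthetic image (a stand-in for a face image; purely illustrative)."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    return np.sin(xs / 6.0) + np.cos(ys / 7.0)

def sample(image, centre, offsets):
    """Read intensities at the support pixels located at centre + offsets, as (x, y)."""
    pts = np.rint(centre + offsets).astype(int)
    return image[pts[:, 1], pts[:, 0]]

def train_linear_predictor(image, centre, offsets, n_train=500, max_shift=4, rng=None):
    """Fit H so that (intensity difference) @ H approximates the correction -shift.

    Training data come from synthetically displacing the support region,
    which is an assumption of this sketch.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    base = sample(image, centre, offsets)  # intensities at the true feature position
    shifts = rng.integers(-max_shift, max_shift + 1, size=(n_train, 2)).astype(float)
    diffs = np.stack([sample(image, centre + s, offsets) - base for s in shifts])
    # Least-squares solution of diffs @ H = -shifts
    H, *_ = np.linalg.lstsq(diffs, -shifts, rcond=None)
    return base, H

def predict(image, position, offsets, base, H):
    """Predict the (dx, dy) correction that moves `position` back onto the feature."""
    return (sample(image, position, offsets) - base) @ H

if __name__ == "__main__":
    image = make_image()
    rng = np.random.default_rng(1)
    offsets = rng.integers(-8, 9, size=(40, 2)).astype(float)  # hypothetical support pixels
    centre = np.array([32.0, 32.0])                            # hypothetical feature position
    base, H = train_linear_predictor(image, centre, offsets)
    drifted = centre + np.array([2.0, -1.0])
    correction = predict(image, drifted, offsets, base, H)
    print("predicted correction:", correction)
    print("recovered position:", drifted + correction)
```

Prediction at run time is a single vector-matrix product over the support pixels, which is why this kind of predictor is cheap enough for real-time use. In this sketch the support pixels are drawn at random; the abstract's probabilistic selection step is what would replace that choice.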
INDEX TERMS
Facial feature tracking, learning, linear predictors, multiple resolution, probabilistic selection.
CITATION
Eng-Jon Ong, Richard Bowden, "Robust Facial Feature Tracking Using Shape-Constrained Multiresolution-Selected Linear Predictors", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.33, no. 9, pp. 1844-1859, September 2011, doi:10.1109/TPAMI.2010.205