Active and Dynamic Information Fusion for Facial Expression Understanding from Image Sequences
May 2005 (vol. 27 no. 5)
pp. 699-714
Yongmian Zhang and Qiang Ji
This paper explores the use of multisensory information fusion techniques with dynamic Bayesian networks (DBNs) for modeling and understanding the temporal behaviors of facial expressions in image sequences. Our facial feature detection and tracking, based on active IR illumination, provides reliable visual information under variable lighting and head motion. Our approach to facial expression recognition rests on a dynamic and probabilistic framework that combines DBNs with Ekman's Facial Action Coding System (FACS) to systematically model the dynamic and stochastic behaviors of spontaneous facial expressions. The framework not only provides a coherent and unified hierarchical probabilistic representation of the spatial and temporal information related to facial expressions, but also allows us to actively select the most informative visual cues from the available information sources to minimize the ambiguity in recognition. Facial expressions are recognized by fusing not only the current visual observations but also the previous visual evidence; consequently, the recognition becomes more robust and accurate through explicit modeling of the temporal behavior of facial expressions. We present the theoretical foundation underlying the proposed probabilistic and dynamic framework for facial expression modeling and understanding. Experimental results demonstrate that our approach can accurately and robustly recognize spontaneous facial expressions from an image sequence under different conditions.
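The abstract describes two mechanisms that a short sketch can make concrete: DBN-based temporal fusion (a prediction step through the expression dynamics followed by a measurement update) and active selection of the visual cue with the greatest expected reduction in recognition ambiguity, measured as mutual information in the sense of Shannon [50]. The sketch below is a minimal illustration under stated assumptions, not the authors' system: the expression set, transition matrix, and the mouth_corner and brow_raise cue models are invented for demonstration, and the full hierarchical FACS model is collapsed to a single discrete state variable.

```python
# A minimal sketch (not the paper's implementation) of DBN-style temporal
# fusion with active cue selection. All probabilities and cue models below
# are illustrative assumptions.
import numpy as np

EXPRESSIONS = ["happiness", "surprise", "anger"]   # hypothetical subset
TRANSITION = np.array([[0.8, 0.1, 0.1],            # P(E_t = j | E_{t-1} = i)
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])

# Per-cue observation models P(observation | expression); rows index the
# expression, columns a discretized measurement of that cue.
CUE_MODELS = {
    "mouth_corner": np.array([[0.7, 0.2, 0.1],
                              [0.2, 0.6, 0.2],
                              [0.1, 0.2, 0.7]]),
    "brow_raise":   np.array([[0.4, 0.4, 0.2],
                              [0.1, 0.8, 0.1],
                              [0.3, 0.2, 0.5]]),
}

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def predict(belief):
    """Temporal (prediction) step: propagate belief through the dynamics."""
    return TRANSITION.T @ belief

def update(belief, cue, observation):
    """Measurement step: fuse one cue's observation into the belief."""
    posterior = belief * CUE_MODELS[cue][:, observation]
    return posterior / posterior.sum()

def most_informative_cue(belief):
    """Pick the cue whose observation, in expectation, most reduces entropy."""
    best, best_gain = None, -np.inf
    for cue, model in CUE_MODELS.items():
        p_obs = belief @ model                     # predictive P(observation)
        expected_h = sum(p_o * entropy(update(belief, cue, o))
                         for o, p_o in enumerate(p_obs) if p_o > 0)
        gain = entropy(belief) - expected_h        # mutual information I(E; cue)
        if gain > best_gain:
            best, best_gain = cue, gain
    return best

belief = np.full(3, 1 / 3)                         # uniform prior over expressions
for frame_observation in [0, 0, 1]:                # fabricated measurement stream
    belief = predict(belief)
    cue = most_informative_cue(belief)             # active cue selection
    belief = update(belief, cue, frame_observation)
    print(cue, dict(zip(EXPRESSIONS, belief.round(3))))
```

Even in this reduced form, the loop reflects the paper's central idea: the cue queried at each frame depends on the current belief, so an ambiguous belief triggers the measurement expected to disambiguate it most.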
References:
[1] M.S. Bartlett, B. Braathen, G.L. Littlewort-Ford, J. Hershey, J. Fasel, T. Marks, E. Smith, T.J. Sejnowski, and J.R. Movellan, “Automatic Analysis of Spontaneous Facial Behavior: A Final Project Report,” Technical Report MPLab-TR2001.08, Univ. of California at San Diego, Dec. 2001.
[2] P. Ekman and W.V. Friesen, Facial Action Coding System (FACS): Manual. Palo Alto, Calif.: Consulting Psychologists Press, 1978.
[3] M. Kato, I. So, Y. Hishnuma, O. Nakamura, and T. Minami, “Description and Synthesis of Facial Expressions Based on Isodensity Maps,” Visual Computing, T. Kunii, ed., pp. 39-56, 1991.
[4] G.W. Cottrell and J. Metcalfe, “Face, Emotion, Gender Recognition Using Holons,” Advances in NIPS, R.P. Lippman, ed., pp. 564-571, 1991.
[5] A. Rahardja, A. Sowmya, and W.H. Wilson, “A Neural Network Approach to Component versus Holistic Recognition of Facial Expressions in Images,” Proc. SPIE, Intelligent Robots and Computer Vision X: Algorithms and Techniques, vol. 1607, pp. 62-70, 1991.
[6] H. Kobayashi and F. Hara, “Recognition of Six Basic Facial Expressions and Their Strength by Neural Network,” Proc. Int'l Workshop Robot and Human Comm., pp. 381-386, 1992.
[7] G.D. Kearney and S. McKenzie, “Machine Interpretation of Emotion: Design of Memory-Based Expert System for Interpreting Facial Expressions in Terms of Signaled Emotions (JANUS),” Cognitive Science, vol. 17, no. 4, pp. 589-622, 1993.
[8] H. Ushida, T. Takagi, and T. Yamaguchi, “Recognition of Facial Expressions Using Conceptual Fuzzy Sets,” Proc. IEEE Int'l Conf. Fuzzy Systems, pp. 594-599, 1993.
[9] K. Mase, “Recognition of Facial Expression from Optical Flow,” IEICE Trans., vol. E74, no. 10, pp. 3474-3483, 1991.
[10] Y. Yacoob and L. Davis, “Recognizing Facial Expressions by Spatio-Temporal Analysis,” Proc. Int'l Conf. Pattern Recognition, pp. 747-749, 1994.
[11] M. Rosenblum, Y. Yacoob, and L. Davis, “Human Emotion Recognition from Motion Using a Radial Basis Function Network Architecture,” Proc. IEEE Workshop Motion of Non-Rigid and Articulated Objects, pp. 43-49, 1994.
[12] M. Pantic and L. Rothkrantz, “Automatic Analysis of Facial Expressions: The State of the Art,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1424-1445, Dec. 2000.
[13] G. Donato, M.S. Bartlett, J.C. Hager, P. Ekman, and T.J. Sejnowski, “Classifying Facial Actions,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 10, pp. 974-989, Oct. 1999.
[14] B. Fasel and J. Luettin, “Automatic Facial Expression Analysis: A Survey,” Pattern Recognition, vol. 36, no. 1, pp. 259-275, 2003.
[15] Y. Yacoob and L.S. Davis, “Recognizing Human Facial Expressions from Long Image Sequences Using Optical Flow,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 6, pp. 636-642, June 1996.
[16] I.A. Essa and A.P. Pentland, “Coding, Analysis, Interpretation, and Recognition of Facial Expressions,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 757-763, July 1997.
[17] J.F. Cohn, A.J. Zlochower, J.J. Lien, and T. Kanade, “Feature-Point Tracking by Optical Flow Discriminates Subtle Differences in Facial Expression,” Proc. IEEE Int'l Conf. Automatic Face and Gesture Recognition, pp. 396-401, 1998.
[18] D. Terzopoulos and K. Waters, “Analysis and Synthesis of Facial Image Sequence Using Physical and Anatomical Models,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 6, pp. 569-579, June 1993.
[19] N.M. Thalmann, P. Kalra, and M. Escher, “Face to Virtual Face,” Proc. IEEE, vol. 86, no. 5, pp. 870-883, 1998.
[20] A. Lanitis, C.J. Taylor, and T.F. Cootes, “Automatic Interpretation and Coding of Face Images Using Flexible Models,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 743-756, July 1997.
[21] M. Pantic and L. Rothkrantz, “Expert System for Automatic Analysis of Facial Expression,” J. Image and Vision Computing, vol. 18, no. 11, pp. 881-905, 2000.
[22] Y. Tian, T. Kanade, and J.F. Cohn, “Recognizing Action Units for Facial Expression Analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 97-115, Feb. 2001.
[23] M.S. Bartlett, P.A. Viola, T.J. Sejnowski, B.A. Golomb, J. Larsen, J.C. Hager, and P. Ekman, “Classifying Facial Action,” Advances in Neural Information Processing Systems 8, D. Touretzky, M. Mozer, and M. Hasselmo, eds., pp. 823-829, 1996.
[24] C. Padgett and G. Cottrell, “Representing Face Images for Emotion Classification,” Advances in Neural Information Processing Systems, M. Mozer, M. Jordan, and T. Petsche, eds., vol. 9, 1997.
[25] M.J. Lyons, J. Budynek, and S. Akamatsu, “Automatic Classification of Single Facial Images,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 12, pp. 1357-1362, Dec. 1999.
[26] P.N. Belhumeur, J.P. Hespanha, and D.J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, July 1997.
[27] C. Huang and Y. Huang, “Facial Expression Recognition Using Model-Based Feature Extraction and Action Parameters Classification,” J. Visual Comm. and Image Representation, vol. 8, no. 3, pp. 278-290, 1997.
[28] Z. Zhu, Q. Ji, K. Fujimura, and K. Lee, “Combining Kalman Filtering and Mean Shift for Real Time Eye Tracking under Active IR Illumination,” Proc. Int'l Conf. Pattern Recognition, Aug. 2002.
[29] J. Zhao and G. Kearney, “Classifying Facial Emotions by Backpropagation Neural Networks with Fuzzy Inputs,” Proc. Int'l Conf. Neural Information Processing, pp. 454-457, 1996.
[30] Z. Zhang, M. Lyons, M. Schuster, and S. Akamatsu, “Comparison Between Geometry-Based and Gabor Wavelets-Based Facial Expression Recognition Using Multi-Layer Perceptron,” Proc. Int'l Conf. Automatic Face and Gesture Recognition, pp. 454-459, 1998.
[31] A. Colmenarez, B. Frey, and T.S. Huang, “A Probabilistic Framework for Embedded Face and Facial Expression Recognition,” Proc. Int'l Conf. Computer Vision and Pattern Recognition, 1999.
[32] J.N. Bassili, “Emotion Recognition: The Role of Facial Movement and the Relative Importance of Upper and Lower Area of the Face,” J. Personality and Social Psychology, vol. 37, pp. 2049-2059, 1979.
[33] M. Rosenblum, Y. Yacoob, and L.S. Davis, “Human Expression Recognition from Motion Using a Radial Basis Function Network Architecture,” IEEE Trans. Neural Networks, vol. 7, no. 5, pp. 1121-1137, 1996.
[34] M.J. Black and Y. Yacoob, “Recognizing Facial Expression in Image Sequences Using Local Parameterized Models of Image Motion,” Int'l J. Computer Vision, vol. 25, no. 1, pp. 23-48, 1997.
[35] N. Oliver, A. Pentland, and F. Bérard, “LAFTER: Lips and Face Real Time Tracker with Facial Expression Recognition,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1997.
[36] J.J. Lien, T. Kanade, J.F. Cohn, and C. Li, “Detection, Tracking, and Classification of Action Units in Facial Expression,” J. Robotics and Autonomous Systems, vol. 31, pp. 131-146, 2000.
[37] Y. Zhang and Q. Ji, “Facial Expression Understanding in Image Sequences Using Dynamic and Active Visual Information Fusion,” Proc. Ninth IEEE Int'l Conf. Computer Vision, 2003.
[38] F. Pighin, R. Szeliski, and D. Salesin, “Modeling and Animating Realistic Faces from Images,” Int'l J. Computer Vision, vol. 50, no. 2, pp. 143-169, 2002.
[39] H. Tao and T. Huang, “Visual Estimation and Compression of Facial Motion Parameters: Elements of a 3D Model-Based Video Coding System,” Int'l J. Computer Vision, vol. 50, no. 2, pp. 111-125, 2002.
[40] S. Goldenstein, C. Vogler, and D. Metaxas, “Statistical Cue Integration in DAG Deformable Models,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 7, pp. 801-813, July 2003.
[41] C. Morimoto, D. Koons, A. Amir, and M. Flickner, “Frame-Rate Pupil Detector and Gaze Tracker,” Proc. IEEE Int'l Conf. Computer Vision Frame-Rate Workshop, Sept. 1999.
[42] P.S. Maybeck, Stochastic Models, Estimation, and Control. Academic Press, Inc., 1979.
[43] D. Comaniciu and P. Meer, “Mean Shift: A Robust Approach toward Feature Space Analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603-619, May 2002.
[44] L.G. Farkas, Anthropometry of the Head and Face. New York: Raven Press, 1994.
[45] L. Wiskott, J.-M. Fellous, N. Kruger, and C. von der Malsburg, “Face Recognition by Elastic Bunch Graph Matching,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 775-779, July 1997.
[46] P. Ekman, “Facial Expressions of Emotion: An Old Controversy and New Findings,” Philosophical Trans. Royal Soc. London B, vol. 335, pp. 63-69, 1992.
[47] V.I. Pavlovic, “Dynamic Bayesian Networks for Information Fusion with Applications to Human-Computer Interfaces,” PhD thesis, Univ. of Illinois at Urbana-Champaign, 1999.
[48] L. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257-286, 1989.
[49] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, Calif.: Morgan Kaufmann, 1988.
[50] C.E. Shannon, “A Mathematical Theory of Communication,” Bell System Technical J., vol. 27, pp. 379-423, 1948.
[51] MPEG, “ISO/IEC 14496-MPEG-4 International Standard,” 1998.

Index Terms:
Facial expression analysis, dynamic Bayesian networks, visual information fusion, active sensing.
Citation:
Yongmian Zhang, Qiang Ji, "Active and Dynamic Information Fusion for Facial Expression Understanding from Image Sequences," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 5, pp. 699-714, May 2005, doi:10.1109/TPAMI.2005.93