Classifying Facial Actions
October 1999 (vol. 21 no. 10)
pp. 974-989
Abstract: The Facial Action Coding System (FACS) [23] is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared with that of naive and expert human subjects. The best performance was obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions.
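As a rough illustration of the "outputs of local filters" idea named in the abstract, the following is a minimal sketch of a Gabor wavelet representation in NumPy. The abstract does not give filter parameters, so the function names (`gabor_kernel`, `gabor_magnitudes`), the wavelengths, the number of orientations, and the choice of Gaussian width are all assumptions made for illustration, not the settings used in the paper.

```python
import numpy as np


def gabor_kernel(wavelength, theta, sigma):
    """Complex 2-D Gabor kernel: a plane wave windowed by a Gaussian.

    The kernel extends to about three standard deviations (an assumed cutoff).
    """
    half = int(np.ceil(3.0 * sigma))
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Coordinate along the direction in which the carrier oscillates.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.exp(1j * 2.0 * np.pi * x_rot / wavelength)
    # Remove the DC component so uniform brightness gives no response.
    kernel -= envelope * (kernel.sum() / envelope.sum())
    return kernel


def gabor_magnitudes(image, wavelengths=(4.0, 8.0, 16.0), n_orientations=8):
    """Return response magnitudes with shape (n_filters, H, W).

    Filters are ordered with wavelength as the outer loop and orientation
    as the inner loop. All parameter defaults are illustrative assumptions.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    responses = []
    for wavelength in wavelengths:
        sigma = 0.5 * wavelength  # assumed envelope width
        for k in range(n_orientations):
            theta = np.pi * k / n_orientations
            kernel = gabor_kernel(wavelength, theta, sigma)
            half = kernel.shape[0] // 2
            # Zero-padded FFT convolution, then crop to the input size.
            shape = (h + kernel.shape[0] - 1, w + kernel.shape[1] - 1)
            full = np.fft.ifft2(np.fft.fft2(image, s=shape) *
                                np.fft.fft2(kernel, s=shape))
            responses.append(np.abs(full[half:half + h, half:half + w]))
    return np.stack(responses)


if __name__ == "__main__":
    # Synthetic stand-in for a face image: a grating with an 8-pixel period.
    cols = np.arange(128)
    grating = np.tile(np.cos(2.0 * np.pi * cols / 8.0), (128, 1))
    features = gabor_magnitudes(grating)
    print(features.shape)
```

In a classification setting, the magnitude maps would typically be downsampled and flattened into a feature vector before being passed to a classifier; that step is omitted here because the abstract does not describe it.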
[1] J. Atick and N. Redlich, “What Does the Retina Know about Natural Scenes?” Neural Computation, vol. 4, no. 2, pp. 196-210, 1992.
[2] M.S. Bartlett, “Face Image Analysis by Unsupervised Learning and Redundancy Reduction,” PhD thesis, Univ. of California, San Diego, 1998.
[3] M.S. Bartlett, J.C. Hager, P. Ekman, and T.J. Sejnowski, “Measuring Facial Expressions by Computer Image Analysis,” Psychophysiology, vol. 36, pp. 253-263, 1999.
[4] M.S. Bartlett, H.M. Lades, and T.J. Sejnowski, “Independent Component Representations for Face Recognition,” Proc. SPIE Symp. Electronic Imaging: Science and Technology; Human Vision and Electronic Imaging III, T. Rogowitz and B. Pappas, eds., vol. 3299, pp. 528-539, San Jose, Calif., 1998.
[5] M.S. Bartlett and T.J. Sejnowski, “Viewpoint Invariant Face Recognition Using Independent Component Analysis and Attractor Networks,” Advances in Neural Information Processing Systems, M. Mozer, M. Jordan, and T. Petsche, eds., vol. 9, pp. 817-823, Cambridge, Mass., 1997.
[6] M.S. Bartlett, P.A. Viola, T.J. Sejnowski, J. Larsen, J. Hager, and P. Ekman, “Classifying Facial Action,” Advances in Neural Information Processing Systems, D. Touretzky, M. Mozer, and M. Hasselmo, eds., vol. 8, pp. 823-829, 1996.
[7] J. Bassili, “Emotion Recognition: The Role of Facial Movement and the Relative Importance of Upper and Lower Areas of the Face,” J. Personality and Social Psychology, vol. 37, pp. 2049-2059, 1979.
[8] P.N. Belhumeur, J. Hespanha, and D. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, July 1997.
[9] A.J. Bell and T.J. Sejnowski, “An Information-Maximization Approach to Blind Separation and Blind Deconvolution,” Neural Computation, vol. 7, no. 6, June 1995.
[10] A.J. Bell and T.J. Sejnowski, “The Independent Components of Natural Scenes Are Edge Filters,” Vision Research, vol. 37, no. 23, pp. 3327-3338, 1997.
[11] D. Beymer and T. Poggio, “Image Representations for Visual Learning,” Science, vol. 272, no. 5270, pp. 1905-1909, 1996.
[12] D. Beymer, A. Shashua, and T. Poggio, “Example Based Image Analysis and Synthesis,” M.I.T. A.I. Memo No. 1431, 1993.
[13] R. Brunelli and T. Poggio, “Face Recognition: Features vs. Templates,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 10, pp. 1042-1053, Oct. 1993.
[14] R. Chellappa, “Discriminant Analysis for Face Recognition,” Face Recognition: From Theory to Applications, H. Wechsler, P.J. Phillips, V. Bruce, F. Fogelman-Soulie, and T. Huang, eds., Springer-Verlag, 1998.
[15] J.F. Cohn, A.J. Zlochower, J.J. Lien, Y.-T. Wu, and T. Kanade, “Automated Face Coding: A Computer-Vision Based Method of Facial Expression Analysis,” Psychophysiology, vol. 35, no. 1, pp. 35-43, 1999.
[16] P. Comon, “Independent Component Analysis, a New Concept?” Signal Processing, vol. 36, no. 3, 1994.
[17] G. Cottrell and J. Metcalfe, “Face, Gender and Emotion Recognition Using Holons,” Advances in Neural Information Processing Systems, D. Touretzky, ed., vol. 3, pp. 564-571, San Mateo, Calif.: Morgan Kaufmann, 1991.
[18] G.W. Cottrell and M.K. Fleming, “Face Recognition Using Unsupervised Feature Extraction,” Proc. Int'l Neural Network Conf., pp. 322-325, Dordrecht, The Netherlands, 1990.
[19] K.D. Craig, S.A. Hyde, and C.J. Patrick, “Genuine, Suppressed, and Faked Facial Behavior During Exacerbation of Chronic Low Back Pain,” Pain, vol. 46, pp. 161-172, 1991.
[20] J.G. Daugman, “Complete Discrete 2D Gabor Transforms by Neural Networks for Image Analysis and Compression,” IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 36, no. 7, 1988.
[21] R. DeValois and K. DeValois, Spatial Vision. Oxford Press, 1988.
[22] P. Ekman, Telling Lies: Clues to Deceit in the Marketplace, Politics, and Marriage, first ed. New York: W.W. Norton, 1985.
[23] P. Ekman and W. Friesen, Facial Action Coding System: A Technique for the Measurement of Facial Movement. Palo Alto, Calif.: Consulting Psychologists Press, 1978.
[24] P. Ekman, W. Friesen, and M. O'Sullivan, “Smiles When Lying,” J. Personality and Social Psychology, vol. 54, pp. 414-420, 1988.
[25] P. Ekman and E.L. Rosenberg, What the Face Reveals: Basic and Applied Studies of Spontaneous Expression using the Facial Action Coding System (FACS). New York: Oxford Univ. Press, 1997.
[26] I.A. Essa and A.P. Pentland, “Coding, Analysis, Interpretation, and Recognition of Facial Expressions,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 757-763, July 1997.
[27] D.J. Field, “What Is the Goal of Sensory Coding?” Neural Computation, vol. 6, no. 4, pp. 559-601, July 1994.
[28] R.A. Fisher, “The Use of Multiple Measurements in Taxonomic Problems,” Ann. Eugenics, vol. 7, pp. 179-188, 1936.
[29] B.A. Golomb, D.T. Lawrence, and T.J. Sejnowski, “Sexnet: A Neural Network Identifies Sex from Human Faces,” Advances in Neural Information Processing Systems, R.P. Lippman, J. Moody, and D.S. Touretzky, eds., vol. 3, pp. 572-577, San Mateo, Calif.: Morgan Kaufmann, 1991.
[30] M.S. Gray, J. Movellan, and T.J. Sejnowski, “A Comparison of Local versus Global Image Decomposition for Visual Speechreading,” Proc. Fourth Joint Symp. Neural Computation, pp. 92-98, Inst. for Neural Computation, La Jolla, Calif., 1997.
[31] J. Hager and P. Ekman, “The Essential Behavioral Science of the Face and Gesture that Computer Scientists Need to Know,” Proc. Int'l Workshop Automatic Face- and Gesture-Recognition, M. Bichsel, ed., pp. 7-11, 1995.
[32] P. Hallinan, “A Deformable Model for Face Recognition Under Arbitrary Lighting Conditions,” PhD thesis, Harvard Univ., 1995.
[33] D. Heeger, “Nonlinear Model of Neural Responses in Cat Visual Cortex,” Computational Models of Visual Processing, M. Landy and J. Movshon, eds., pp. 119-133, Cambridge, Mass.: MIT Press, 1991.
[34] M. Heller and V. Haynal, “The Faces of Suicidal Depression (translation Les Visages de la Depression de Suicide),” Cahiers Psychiatriques Genevois (Medecine et Hygiene Editors), vol. 16, pp. 107-117, 1994.
[35] W. Himer, F. Schneider, G. Kost, and H. Heimann, “Computer-Based Analysis of Facial Action: A New Approach,” J. Psychophysiology, vol. 5, no. 2, pp. 189-195, 1991.
[36] J. Jones and L. Palmer, “An Evaluation of the Two Dimensional Gabor Filter Model of Simple Receptive Fields in Cat Striate Cortex,” J. Neurophysiology, vol. 58, pp. 1233-1258, 1987.
[37] S. Kaiser and T. Wehrle, “Automated Coding of Facial Behavior in Human-Computer Interactions with FACS,” J. Nonverbal Behavior, vol. 16, no. 2, pp. 65-140, 1992.
[38] S. Kanfer, Serious Business: The Art and Commerce of Animation in America from Betty Boop to Toy Story. New York: Scribner, 1997.
[39] M. Lades, J.C. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R.P. Wurtz, and W. Konen, “Distortion Invariant Object Recognition in the Dynamic Link Architecture,” IEEE Trans. Computers, vol. 42, no. 3, pp. 300-311, Mar. 1993.
[40] A. Lanitis, C.J. Taylor, and T.F. Cootes, “Automatic Interpretation and Coding of Face Images using Flexible Models,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 743-756, July 1997.
[41] M. Lewicki and B. Olshausen, “Inferring Sparse, Overcomplete Image Codes Using an Efficient Coding Framework,” Advances in Neural Information Processing Systems, M. Jordan, ed., vol. 10, San Mateo, Calif.: Morgan Kaufmann, in press.
[42] H. Li, P. Roivainen, and R. Forchheimer, “3D Motion Estimation in Model-Based Facial Image Coding,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 6, pp. 545-555, June 1993.
[43] J.J. Lien, T. Kanade, J.F. Cohn, and C.C. Li, “A Multi-Method Approach for Discriminating between Similar Facial Expressions, Including Expression Intensity Information,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 1998.
[44] K. Mase, “Recognition of Facial Expression from Optical Flow,” IEICE Trans. E, vol. 74, no. 10, pp. 3474-3483, 1991.
[45] S.J. McKelvie, “Emotional Expression in Upside-Down Faces: Evidence for Configurational and Componential Processing,” British J. Social Psychology, vol. 34, no. 3, pp. 325-334, 1995.
[46] J.R. Movellan, “Visual Speech Recognition with Stochastic Networks,” Advances in Neural Information Processing Systems, G. Tesauro, D.S. Touretzky, and T. Leen, eds., vol. 7, pp. 851-858, Cambridge, Mass.: MIT Press, 1995.
[47] J-P. Nadal and N. Parga, “Non-Linear Neurons in the Low Noise Limit: A Factorial Code Maximizes Information Transfer,” Network, vol. 5, pp. 565-581, 1994.
[48] C. Padgett and G. Cottrell, “Representing Face Images for Emotion Classification,” Advances in Neural Information Processing Systems, M. Mozer, M. Jordan, and T. Petsche, eds., vol. 9, Cambridge, Mass.: MIT Press, 1997.
[49] P.S. Penev and J.J. Atick, “Local Feature Analysis: A General Statistical Theory for Object Representation,” Network: Computation in Neural Systems, vol. 7, no. 3, pp. 477-500, 1996.
[50] M.L. Phillips, A.W. Young, C. Senior, C. Brammer, M. Andrews, A.J. Calder, E.T. Bullmore, D.I. Perrett, D. Rowland, S.C.R. Williams, A.J. Gray, and A.S. David, “A Specific Neural Substrate for Perceiving Facial Expressions of Disgust,” Nature, vol. 389, pp. 495-498, 1997.
[51] P.J. Phillips, H. Wechsler, J. Huang, and P.J. Rauss, “The FERET Database and Evaluation Procedure for Face-Recognition Algorithms,” Image and Vision Computing J., vol. 16, no. 5, pp. 295-306, 1998.
[52] D.A. Pollen and S.F. Ronner, “Phase Relationship between Adjacent Simple Cells in the Visual Cortex,” Science, vol. 212, pp. 1409-1411, 1981.
[53] W.K. Pratt, Digital Image Processing. New York: John Wiley & Sons, 1978.
[54] M. Rosenblum, Y. Yacoob, and L.S. Davis, “Human Expression Recognition from Motion Using a Radial Basis Function Network Architecture,” IEEE Trans. Neural Networks, vol. 7, no. 5, pp. 1121-1138, 1996.
[55] M. Rydfalk, “CANDIDE: A Parametrized Face,” PhD thesis, Linkoping Univ., Dept. of Electrical Eng., Oct. 1987.
[56] A. Shashua, “Geometry and Photometry in 3D Visual Recognition,” PhD dissertation, Dept. of Brain and Cognitive Sciences, Massachusetts Inst. of Technology, Cambridge, Nov. 1992.
[57] F. Silla, M.P. Malumbres, A. Robles, P. López, and J. Duato, “Efficient Adaptive Routing in Networks of Workstations with Irregular Topology,” Proc. Workshop Comm. and Architectural Support for Network-Based Parallel Computing, Feb. 1997.
[58] A. Singh, Optic Flow Computation. Los Alamitos, Calif.: IEEE CS Press, 1991.
[59] D. Terzopoulos and K. Waters, “Analysis and Synthesis of Facial Image Sequences Using Physical and Anatomical Models,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 6, pp. 569-579, 1993.
[60] M. Turk and A. Pentland, “Eigenfaces for Recognition,” J. Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[61] T. Vetter and T. Poggio, “Linear Object Classes and Image Synthesis from Single Example Image,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 733-741, July 1997.
[62] T. Vetter and N.E. Troje, “Separation of Texture and Shape in Images of Faces for Image Coding and Synthesis,” J. Optical Soc. Am. A (Optics, Image Science, and Vision), vol. 14, no. 9, pp. 2152-2161, 1997.
[63] H. Wallbott, “Effects of Distortion of Spatial and Temporal Resolution of Video Stimuli on Emotion Attributions,” J. Nonverbal Behavior, vol. 15, no. 6, pp. 5-20, 1992.
[64] Y. Yacoob and L.S. Davis, “Recognizing Human Facial Expression from Long Image Sequences Using Optical Flow,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 6, pp. 636-642, June 1996.
[65] J. Zhang, Y. Yan, and M. Lades, “Face Recognition: Eigenface, Elastic Matching, and Neural Nets,” Proc. IEEE, vol. 85, no. 9, pp. 1423-1435, Sept. 1997.
[66] Z. Zhang, “Feature-Based Facial Expression Recognition: Sensitivity Analysis and Experiments with a Multi-Layer Perceptron,” Int'l J. Pattern Recognition and Artificial Intelligence, in press.
Index Terms:
Computer vision, facial expression recognition, independent component analysis, principal component analysis, Gabor wavelets, Facial Action Coding System.
Citation:
Gianluca Donato, Marian Stewart Bartlett, Joseph C. Hager, Paul Ekman, Terrence J. Sejnowski, "Classifying Facial Actions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 10, pp. 974-989, Oct. 1999, doi:10.1109/34.799905