A Multimodal Database for Affect Recognition and Implicit Tagging
Jan.-March 2012 (vol. 3 no. 1)
pp. 42-55
M. Soleymani, Comput. Sci. Dept., Univ. of Geneva, Carouge, Switzerland
J. Lichtenauer, Dept. of Comput., Imperial Coll. London, London, UK
T. Pun, Comput. Sci. Dept., Univ. of Geneva, Carouge, Switzerland
M. Pantic, Dept. of Comput., Imperial Coll. London, London, UK
MAHNOB-HCI is a multimodal database recorded in response to affective stimuli, created for research on emotion recognition and implicit tagging. A multimodal setup was arranged for the synchronized recording of face videos, audio signals, eye gaze data, and peripheral/central nervous system physiological signals. Twenty-seven participants of both genders and from different cultural backgrounds took part in two experiments. In the first experiment, they watched 20 emotional videos and self-reported their felt emotions on scales of arousal, valence, dominance, and predictability, as well as with emotional keywords. In the second experiment, short videos and images were shown first without any tag and then with either a correct or an incorrect tag, and participants reported their agreement or disagreement with the displayed tags. The recorded videos and bodily responses were segmented and stored in a database, which is made available to the academic community via a web-based system. The collected data were analyzed, and single-modality and modality-fusion results are reported for both the emotion recognition and the implicit tagging experiments. These results demonstrate the potential uses of the recorded modalities and the significance of the emotion elicitation protocol.
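The abstract's modality-fusion results are not reproduced here, but the general idea of feature-level fusion can be illustrated with a minimal, self-contained sketch. The feature arrays, labels, and nearest-centroid classifier below are hypothetical stand-ins chosen for brevity; they are not the features or classifiers used in the paper:

```python
import numpy as np

def feature_level_fusion(eye_feats, physio_feats):
    # Feature-level fusion: concatenate per-trial feature vectors
    # from two modalities into one joint representation.
    return np.concatenate([eye_feats, physio_feats], axis=1)

def nearest_centroid_fit(X, y):
    # Compute one centroid per class (a deliberately simple classifier).
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest_centroid_predict(model, X):
    # Assign each sample to the class of its nearest centroid.
    classes, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Synthetic data: 40 trials, binary low/high-arousal labels, and
# hypothetical eye-gaze (4-D) and physiological (6-D) feature vectors
# whose means shift with the label.
rng = np.random.default_rng(0)
n = 40
y = rng.integers(0, 2, n)
eye = rng.normal(y[:, None], 1.0, (n, 4))
phys = rng.normal(y[:, None], 1.0, (n, 6))

X = feature_level_fusion(eye, phys)          # shape (40, 10)
model = nearest_centroid_fit(X[:30], y[:30])  # train on first 30 trials
pred = nearest_centroid_predict(model, X[30:])
acc = float((pred == y[30:]).mean())          # held-out accuracy
```

Decision-level fusion, by contrast, would train one classifier per modality and combine their outputs (e.g., by voting or averaging confidences) rather than concatenating features.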

References:
[1] M. Pantic and A. Vinciarelli, “Implicit Human-Centered Tagging,” IEEE Signal Processing Magazine, vol. 26, no. 6, pp. 173-180, Nov. 2009.
[2] Z. Zeng, M. Pantic, G.I. Roisman, and T.S. Huang, “A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 1, pp. 39-58, Jan. 2009.
[3] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld, “Learning Realistic Human Actions from Movies,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, June 2008.
[4] J.A. Healey and R.W. Picard, “Detecting Stress during Real-World Driving Tasks Using Physiological Sensors,” IEEE Trans. Intelligent Transportation Systems, vol. 6, no. 2, pp. 156-166, June 2005.
[5] M. Pantic, M. Valstar, R. Rademaker, and L. Maat, “Web-Based Database for Facial Expression Analysis,” Proc. IEEE Int'l Conf. Multimedia and Expo, pp. 317-321, 2005.
[6] E. Douglas-Cowie, R. Cowie, I. Sneddon, C. Cox, O. Lowry, M. McRorie, J.-C. Martin, L. Devillers, S. Abrilian, A. Batliner, N. Amir, and K. Karpouzis, “The HUMAINE Database: Addressing the Collection and Annotation of Naturalistic and Induced Emotional Data,” Proc. Second Int'l Conf. Affective Computing and Intelligent Interaction, A. Paiva et al., eds., pp. 488-500, 2007.
[7] M. Grimm, K. Kroschel, and S. Narayanan, “The Vera am Mittag German Audio-Visual Emotional Speech Database,” Proc. IEEE Int'l Conf. Multimedia and Expo, pp. 865-868, Apr. 2008.
[8] G. McKeown, M.F. Valstar, R. Cowie, and M. Pantic, “The SEMAINE Corpus of Emotionally Coloured Character Interactions,” Proc. IEEE Int'l Conf. Multimedia and Expo, pp. 1079-1084, July 2010.
[9] S. Koelstra, C. Mühl, M. Soleymani, A. Yazdani, J.-S. Lee, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras, “DEAP: A Database for Emotion Analysis Using Physiological Signals,” IEEE Trans. Affective Computing, vol. 3, no. 1, pp. 18-31, Jan.-Mar. 2012.
[10] M.F. Valstar and M. Pantic, “Induced Disgust, Happiness and Surprise: An Addition to the MMI Facial Expression Database,” Proc. Int'l Conf. Language Resources and Evaluation, Workshop EMOTION, pp. 65-70, May 2010.
[11] E. Douglas-Cowie, R. Cowie, and M. Schröder, “A New Emotion Database: Considerations, Sources and Scope,” Proc. ISCA Int'l Technical Research Workshop Speech and Emotion, pp. 39-44, 2000.
[12] J.A. Russell, “Culture and the Categorization of Emotions,” Psychological Bull., vol. 110, no. 3, pp. 426-450, 1991.
[13] J.A. Russell and A. Mehrabian, “Evidence for a Three-Factor Theory of Emotions,” J. Research in Personality, vol. 11, no. 3, pp. 273-294, Sept. 1977.
[14] J.R.J. Fontaine, K.R. Scherer, E.B. Roesch, and P.C. Ellsworth, “The World of Emotions Is Not Two-Dimensional,” Psychological Science, vol. 18, no. 12, pp. 1050-1057, 2007.
[15] M. Soleymani, J. Davis, and T. Pun, “A Collaborative Personalized Affective Video Retrieval System,” Proc. Third Int'l Conf. Affective Computing and Intelligent Interaction and Workshops, Sept. 2009.
[16] J.D. Morris, “Observations: SAM: The Self-Assessment Manikin; An Efficient Cross-Cultural Measurement of Emotional Response,” J. Advertising Research, vol. 35, no. 6, pp. 63-68, 1995.
[17] A. Schaefer, F. Nils, X. Sanchez, and P. Philippot, “Assessing the Effectiveness of a Large Database of Emotion-Eliciting Films: A New Tool for Emotion Researchers,” Cognition and Emotion, vol. 24, no. 7, pp. 1153-1172, 2010.
[18] J. Rottenberg, R.D. Ray, and J.J. Gross, “Emotion Elicitation Using Films,” Handbook of Emotion Elicitation and Assessment, series in affective science, pp. 9-28, Oxford Univ. Press, 2007.
[19] The Psychology of Facial Expression, J. Russell and J. Fernandez-Dols, eds. Cambridge Univ. Press, 1997.
[20] D. Keltner and P. Ekman, “Facial Expression of Emotion,” Handbook of Emotions, M. Lewis and J.M. Haviland-Jones, eds., second ed., pp. 236-249, Guilford Publications, 2000.
[21] T. Kanade, J.F. Cohn, and Y. Tian, “Comprehensive Database for Facial Expression Analysis,” Proc. IEEE Fourth Int'l Conf. Automatic Face and Gesture Recognition, pp. 46-53, 2000.
[22] M. Pantic and L.J.M. Rothkrantz, “Toward an Affect-Sensitive Multimodal Human-Computer Interaction,” Proc. IEEE, vol. 91, no. 9, pp. 1370-1390, Sept. 2003.
[23] S. Petridis and M. Pantic, “Is This Joke Really Funny? Judging the Mirth by Audiovisual Laughter Analysis,” Proc. IEEE Int'l Conf. Multimedia and Expo, pp. 1444-1447, 2009.
[24] M.M. Bradley, L. Miccoli, M.A. Escrig, and P.J. Lang, “The Pupil as a Measure of Emotional Arousal and Autonomic Activation,” Psychophysiology, vol. 45, no. 4, pp. 602-607, July 2008.
[25] T. Partala and V. Surakka, “Pupil Size Variation as an Indication of Affective Processing,” Int'l J. Human-Computer Studies, vol. 59, nos. 1/2, pp. 185-198, 2003.
[26] P.J. Lang, M.K. Greenwald, M.M. Bradley, and A.O. Hamm, “Looking at Pictures: Affective, Facial, Visceral, and Behavioral Reactions,” Psychophysiology, vol. 30, no. 3, pp. 261-273, 1993.
[27] R. Adolphs, D. Tranel, and A.R. Damasio, “Dissociable Neural Systems for Recognizing Emotions,” Brain and Cognition, vol. 52, no. 1, pp. 61-69, June 2003.
[28] J. Lichtenauer, J. Shen, M. Valstar, and M. Pantic, “Cost-Effective Solution to Synchronised Audio-Visual Data Capture Using Multiple Sensors,” technical report, Imperial College London, 2010.
[29] D. Sander, D. Grandjean, and K.R. Scherer, “A Systems Approach to Appraisal Mechanisms in Emotion,” Neural Networks, vol. 18, no. 4, pp. 317-352, 2005.
[30] P. Rainville, A. Bechara, N. Naqvi, and A.R. Damasio, “Basic Emotions Are Associated with Distinct Patterns of Cardiorespiratory Activity,” Int'l J. Psychophysiology, vol. 61, no. 1, pp. 5-18, July 2006.
[31] R. McCraty, M. Atkinson, W.A. Tiller, G. Rein, and A.D. Watkins, “The Effects of Emotions on Short-Term Power Spectrum Analysis of Heart Rate Variability,” The Am. J. Cardiology, vol. 76, no. 14, pp. 1089-1093, 1995.
[32] R.A. McFarland, “Relationship of Skin Temperature Changes to the Emotions Accompanying Music,” Applied Psychophysiology and Biofeedback, vol. 10, pp. 255-267, 1985.
[33] J. Kim and E. André, “Emotion Recognition Based on Physiological Changes in Music Listening,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 12, pp. 2067-2083, Dec. 2008.
[34] G. Chanel, J.J.M. Kierkels, M. Soleymani, and T. Pun, “Short-Term Emotion Assessment in a Recall Paradigm,” Int'l J. Human-Computer Studies, vol. 67, no. 8, pp. 607-627, Aug. 2009.
[35] S.K. Sutton and R.J. Davidson, “Prefrontal Brain Asymmetry: A Biological Substrate of the Behavioral Approach and Inhibition Systems,” Psychological Science, vol. 8, no. 3, pp. 204-210, 1997.
[36] R.J. Davidson, “Affective Neuroscience and Psychophysiology: Toward a Synthesis,” Psychophysiology, vol. 40, no. 5, pp. 655-665, Sept. 2003.
[37] V.F. Pamplona, M.M. Oliveira, and G.V.G. Baranoski, “Photorealistic Models for Pupil Light Reflex and Iridal Pattern Deformation,” ACM Trans. Graphics, vol. 28, no. 4, pp. 1-12, 2009.
[38] H. Bouma and L.C.J. Baghuis, “Hippus of the Pupil: Periods of Slow Oscillations of Unknown Origin,” Vision Research, vol. 11, no. 11, pp. 1345-1351, 1971.
[39] R.J. Davidson, P. Ekman, C.D. Saron, J.A. Senulis, and W.V. Friesen, “Approach-Withdrawal and Cerebral Asymmetry: Emotional Expression and Brain Physiology I,” J. Personality and Social Psychology, vol. 58, no. 2, pp. 330-341, 1990.
[40] C.-C. Chang and C.-J. Lin, “LIBSVM: A Library for Support Vector Machines,” software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2001.
[41] J. Jiao and M. Pantic, “Implicit Image Tagging via Facial Information,” Proc. Second Int'l Workshop Social Signal Processing, pp. 59-64, 2010.
[42] I. Patras and M. Pantic, “Particle Filtering with Factorized Likelihoods for Tracking Facial Features,” Proc. IEEE Sixth Int'l Conf. Automatic Face and Gesture Recognition, pp. 97-102, May 2004.
[43] S. Petridis, H. Gunes, S. Kaltwang, and M. Pantic, “Static vs. Dynamic Modeling of Human Nonverbal Behavior from Multiple Cues and Modalities,” Proc. Int'l Conf. Multimodal Interfaces, pp. 23-30, 2009.
[44] Y. Freund and R.E. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting,” Proc. European Conf. Computational Learning Theory, pp. 23-37, 1995.

Index Terms:
affect recognition, emotion recognition, implicit tagging, multimodal database, MAHNOB-HCI, affective stimuli, emotion elicitation protocol, visual databases, face video, audio signals, eye gaze, EEG, peripheral/central nervous system physiological signals, arousal, valence, dominance, predictability, emotional keywords, facial expressions, pattern classification, affective computing
M. Soleymani, J. Lichtenauer, T. Pun, M. Pantic, "A Multimodal Database for Affect Recognition and Implicit Tagging," IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 42-55, Jan.-March 2012, doi:10.1109/T-AFFC.2011.25