Bridging the Gap between Social Animal and Unsocial Machine: A Survey of Social Signal Processing
Jan.-March 2012 (vol. 3, no. 1)
pp. 69-87
Alessandro Vinciarelli, Univ. of Glasgow, Glasgow, UK
M. Pantic, Dept. of Comput., Imperial Coll. London, London, UK
D. Heylen, Human Media Interaction, Univ. of Twente, Enschede, Netherlands
C. Pelachaud, Telecom ParisTech, LTCI, Inst. Telecom, Paris, France
I. Poggi, Univ. Roma Tre, Rome, Italy
F. D'Errico, Univ. Roma Tre, Rome, Italy
M. Schroeder, DFKI GmbH, Saarbrücken, Germany
Social Signal Processing is the research domain aimed at bridging the social intelligence gap between humans and machines. This paper is the first survey of the domain that jointly considers its three major aspects, namely, modeling, analysis, and synthesis of social behavior. Modeling investigates laws and principles underlying social interaction, analysis explores approaches for automatic understanding of social exchanges recorded with different sensors, and synthesis studies techniques for the generation of social behavior via various forms of embodiment. For each of the above aspects, the paper includes an extensive survey of the literature, points to the most important publicly available resources, and outlines the most fundamental challenges ahead.

[1] G. Rizzolatti and L. Craighero, “The Mirror-Neuron System,” Ann. Rev. Neuroscience, vol. 27, pp. 169-192, 2004.
[2] M. Iacoboni, Mirroring People: The Science of Empathy and How We Connect with Others. Picador, 2009.
[3] C. Frith and U. Frith, “Social Cognition in Humans,” Current Biology, vol. 17, no. 16, pp. 724-732, 2007.
[4] J. Pickles, An Introduction to the Physiology of Hearing. Academic Press, 1982.
[5] B. Waller, J. Cray, and A. Burrows, “Selection for Universal Facial Emotion,” Emotion, vol. 8, no. 3, pp. 435-439, 2008.
[6] Z. Kunda, Social Cognition. MIT Press, 1999.
[7] V. Richmond and J. McCroskey, Nonverbal Behaviors in Interpersonal Relations. Allyn and Bacon, 1995.
[8] M. Pantic, A. Nijholt, A. Pentland, and T. Huang, “Human-Centred Intelligent Human-Computer Interaction (HCI$^2$ ): How Far Are We from Attaining It?” Int'l J. Autonomous and Adaptive Comm. Systems, vol. 1, no. 2, pp. 168-187, 2008.
[9] J. Crowley, “Social Perception,” ACM Queue, vol. 4, no. 6, pp. 34-43, 2006.
[10] T. Bickmore and J. Cassell, “Social Dialogue with Embodied Conversational Agents,” Advances in Natural, Multimodal, Dialogue Systems, J. van Kuppevelt, L. Dybkjaer, and N. Bernsen, eds., pp. 23-54, Kluwer, 2005.
[11] F. Wang, K. Carley, D. Zeng, and W. Mao, “Social Computing: From Social Informatics to Social Intelligence,” IEEE Intelligent Systems, vol. 22, no. 2, pp. 79-83, Mar. 2007.
[12] A. Pentland, “Socially Aware Computation and Communication,” Computer, vol. 38, no. 3, pp. 33-40, Mar. 2005.
[13] A. Vinciarelli, M. Pantic, and H. Bourlard, “Social Signal Processing: Survey of an Emerging Domain,” Image and Vision Computing J., vol. 27, no. 12, pp. 1743-1759, 2009.
[14] K. Albrecht, Social Intelligence: The New Science of Success. John Wiley & Sons Ltd., 2005.
[15] I. Poggi and F. D'Errico, “Cognitive Modelling of Human Social Signals,” Proc. IEEE Workshop Social Signal Processing, pp. 21-26, 2010.
[16] P. Brunet, H. Donnan, G. McKeown, E. Douglas-Cowie, and R. Cowie, “Social Signal Processing: What are the Relevant Variables? and in What Ways Do They Relate?” Proc. IEEE Workshop Social Signal Processing, pp. 1-6, 2009.
[17] I. Poggi, Mind, Hands, Face and Body: A Goal and Belief View of Multimodal Communication. Weidler Buchverlag, 2007.
[18] C. Nass and S. Brave, Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship. MIT Press, 2005.
[19] N. Ambady, F. Bernieri, and J. Richeson, “Towards a Histology of Social Behavior: Judgmental Accuracy from Thin Slices of Behavior,” Proc. Advances in Experimental Social Psychology, pp. 201-272, 2000.
[20] U. Eco, Trattato di Semiotica Generale. Bompiani, 1975.
[21] D. Efron, Gesture and Environment. King's Crown Press, 1941.
[22] P. Ekman and W. Friesen, “The Repertoire of Nonverbal Behavior: Categories, Origins, Usage and Coding,” Semiotica, vol. 1, no. 1, pp. 49-98, 1969.
[23] R. Birdwhistell, Introduction to Kinesics, An Annotation System for Analysis of Body Motion and Gesture. Univ. of Louisville, 1952.
[24] P. Ekman, W. Friesen, and J. Hager, Facial Action Coding System (FACS): Manual. A Human Face, 2002.
[25] M. Argyle and M. Cook, Gaze and Mutual Gaze. Cambridge Univ. Press, 1976.
[26] W. Condon and W. Ogston, “A Segmentation of Behavior,” J. Psychiatric Research, vol. 5, no. 3, pp. 221-235, 1967.
[27] E. Klima and U. Bellugi, The Signs of Language. Harvard Univ. Press, 1979.
[28] L. Cerrato, “Communicative Feedback Phenomena across Languages and Modalities,” PhD dissertation, KTH Stockolm, 2007.
[29] B. Hartmann, M. Mancini, and C. Pelachaud, “Implementing Expressive Gesture Synthesis for Embodied Conversational Agents,” Proc. Seventh Int'l Gesture Workshop, pp. 188-199, 2006.
[30] G. Kreidlin, “The Dictionary of Russian Gestures,” Semantics and Pragmatics of Everyday Gestures. C. Mueller and R. Posner, eds. Weidler, 2004.
[31] A. Kendon, “On Gesture: Its Complementary Relationship with Speech,” Nonverbal Behavior and Comm., pp. 65-97, Erlbaum, 1997.
[32] D. McNeill, Hand and Mind. Univ. of Chicago Press, 1992.
[33] Pointing, S. Kita, ed. Erlbaum, 2003.
[34] R.M. Krauss, Y. Chen, and P. Chawla, “Nonverbal Behavior and Nonverbal Communication: What Do Conversational Hand Gestures Tell Us?” Advances in Experimental Social Psychology, vol. 28, pp. 389-450, 1996.
[35] G. Merola, “The Effects of the Gesture Viewpoint on the Students' Memory of Words and Stories,” Proc. Gesture Workshop, pp. 272-281, 2007.
[36] W. Levelt, Speaking from Intention to Articulation. MIT Press, 1989.
[37] H. McGurk and J. McDonalds, “Hearing Lips and Seeing Voices,” Nature, vol. 264, pp. 746-748, 1976.
[38] M. Meredith, “On the Neural Basis for Multisensory Convergence: A Brief Overview,” Cognitive Brain Research, vol. 14, no. 1, pp. 31-40, 2002.
[39] S. Campanella and P. Belin, “Integrating Face and Voice in Person Perception,” Trends in Cognitive Sciences, vol. 11, no. 12, pp. 535-543, 2007.
[40] R. Campbell, “The Processing of Audio-Visual Speech: Empirical and Neural Bases,” Philosophical Trans. Royal Soc. London—B Biological Sciences, vol. 363, no. 1493, pp. 1001-1010, 2007.
[41] R.E.A. Dolan, “Crossmodal Binding of Fear in Voice and Face,” Proc. Nat'l Academy of Sciences USA, vol. 98, pp. 10006-10010, 2001.
[42] A. Oppenheim and R. Schafer, Digital Signal Processing. Prentice-Hall, 1975.
[43] C. Shannon and R. Weaver, The Mathematical Theory of Information. Univ. of Illinois Press, 1949.
[44] O. Hasson, “Cheating Signals,” J. Theoretical Biology, vol. 167, no. 3, pp. 223-238, 1994.
[45] K. Scherer, “Vocal Communication of Emotion: A Review of Research Paradigms,” Speech Comm., vol. 40, nos. 1/2, pp. 227-256, 2003.
[46] R. Conte and C. Castelfranchi, Cognitive and Social Action. Univ. College London, 1995.
[47] M. Wertheimer, “Laws of Organization in Perceptual Forms,” A Source Book of Gestalt Psychology, W. Ellis, ed., pp. 71-88, Routledge & Kegan Paul, 1938.
[48] A. Pentland, Honest Signals: How They Shape Our World. MIT Press, 2008.
[49] E. Ahlsén, J. Lund, and J. Sundqvist, “Multimodality in Own Communication Management,” Proc. Second Nordic Conf. Multimodal Comm., pp. 43-62, 2005.
[50] C. Bazzanella, Le Facce del Parlare. La Nuova Italia, 1994.
[51] J.K. Burgoon and N.E. Dunbar, “Interpersonal Dominance as a Situationally, Interactionally, and Relationally Contingent Social Skill,” Comm. Monographs, vol. 67, no. 1, pp. 96-121, 2000.
[52] D. Heylen, “Challenges Ahead: Head Movements and Other Social Acts in Conversations,” Proc. Joint Symp. Virtual Social Agents, pp. 45-52, 2005.
[53] M. Schröder, D. Heylen, and I. Poggi, “Perception of Non-Verbal Emotional Listener Feedback,” Proc. Speech Prosody, 2006.
[54] N. Chovil, “Discourse-Oriented Facial Displays in Conversation,” Research on Language and Social Interaction, vol. 25, pp. 163-194, 1992.
[55] P. Garotti, “Disprezzo,” Introduzione alla Psicologia delle Emozioni, V. D'Urso and R. Trentin, eds., Laterza, 1998.
[56] M. Argyle, Bodily Communication. Methuen, 1988.
[57] A. Valitutti, O. Stock, and C. Strapparava, “GRAPHLAUGH: A Tool for the Interactive Generation of Humorous Puns,” Proc. Int'l Conf. Affective Computing and Intelligent Interaction, pp. 592-593, 2009.
[58] I. Poggi, F. Cavicchio, and E. Magno Caldognetto, “Irony in a Judicial Debate, Analyzing the Subtleties of Irony while Testing the Subtleties of an Annotation Scheme,” J. Language Resources and Evaluation, vol. 41, nos. 3/4, pp. 215-232, 2008.
[59] I. Poggi and F. D'Errico, “Social Signals and the Action—Cognition Loop, the Case of Overhelp and Evaluation,” Proc. IEEE Conf. Affective Computing and Intelligent Interaction, pp. 106-113, 2009.
[60] B.M. De Paulo, “Nonverbal Behavior and Self-Presentation,” Psychological Bull., vol. 111, no. 2, pp. 203-243, 1992.
[61] I. Poggi and L. Vincze, “Persuasive Gaze in Political Discourse,” Proc. Symp. Persuasive Agents, 2008.
[62] J.K. Burgoon, T. Birk, and M. Pfau, “Nonverbal Behaviors, Persuasion, and Credibility,” Human Comm. Research, vol. 17, no. 1, pp. 140-169, 1990.
[63] J. Atkinson, “Refusing Invited Applause: Preliminary Observations from a Case Study of Charismatic Oratory,” Handbook of Discourse Analysis, T. van Dijk, ed., vol. III, pp. 161-181, Academic Press, 1985.
[64] K. Bousmalis, M. Mehu, and M. Pantic, “Spotting Agreement and Disagreement: A Survey of Nonverbal Audiovisual Cues and Tools,” Proc. Int'l Conf. Affective Computing and Intelligent Interaction, vol. II, pp. 121-129, 2009.
[65] M. Lewis, “Self-Conscious Emotions: Embarrassment, Pride, Shame, and Guilt,” Handbook of Emotions, M. Lewis and J. Haviland-Jones, eds., pp. 623-636, Guilford Press, 2000.
[66] Itinerari del Rancore, R. Rizzi ed. Bollati Boringhieri, 2007.
[67] I. Poggi and V. Zuccaro, “Admiration,” Proc. AFFINE Workshop, 2008.
[68] D. Keltner, “Signs of Appeasement: Evidence for the Distinct Displays of Embarrassment, Amusement, and Shame,” J. Personality and Social Psychology, vol. 68, no. 3, pp. 441-454, 1995.
[69] M.G. Frank, P. Ekman, and W.V. Friesen, “Behavioral Markers and Recognizability of the Smile of Enjoyment,” J. Personality and Social Psychology, vol. 64, no. 1, pp. 83-93, 1993.
[70] A. Fridlund and A. Gilbert, “Emotions and Facial Expressions,” Science, vol. 230, pp. 607-608, 1985.
[71] D. Byrne, The Attraction Paradigm. Academic Press, 1971.
[72] E. Berscheid and H. Reiss, “Attraction and Close Relationships,” Handbook of Social Psychology, D. Gilbert, S. Fiske, and G. Lindzey, eds., pp. 193-281, McGraw Hill, 1997.
[73] C. Castelfranchi, “Social Power: A Missed Point in DAI, MA and HCI,” Decentralized AI, Y. Demazeau and J. Mueller, eds., pp. 49-62, Elsevier, 1990.
[74] R. Conte and M. Paolucci, Reputation in Artificial Societies. Social Beliefs for Social Order. Kluwer, 2002.
[75] M. Halliday, Il Linguaggio Come Semiotica Sociale. Zanichelli, 1983.
[76] I. McCowan, S. Bengio, D. Gatica-Perez, G. Lathoud, F. Monay, D. Moore, P. Wellner, and H. Bourlard, “Modeling Human Interaction in Meetings,” Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, pp. 748-751, 2003.
[77] A. Waibel, T. Schultz, M. Bett, M. Denecke, R. Malkin, I. Rogina, and R. Stiefelhagen, “SMaRT: The Smart Meeting Room Task at ISL,” Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, pp. 752-755, 2003.
[78] N. Eagle and A. Pentland, “Reality Mining: Sensing Complex Social Signals,” J. Personal and Ubiquitous Computing, vol. 10, no. 4, pp. 255-268, 2006.
[79] R. Murray-Smith, “Empowering People Rather Than Connecting Them,” Int'l J. Mobile Human Computer Interaction, vol. 3, 2009.
[80] M. Yang, D. Kriegman, and N. Ahuja, “Detecting Faces in Images: A Survey,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34-58, Jan. 2002.
[81] S. Tranter and D. Reynolds, “An Overview of Automatic Speaker Diarization Systems,” IEEE Trans. Audio, Speech, and Language Processing, vol. 14, no. 5, pp. 1557-1565, Sept. 2006.
[82] D. Forsyth, O. Arikan, L. Ikemoto, J. O'Brien, and D. Ramanan, “Computational Studies of Human Motion Part 1: Tracking and Motion Synthesis,” Foundations and Trends in Computer Graphics and Vision, vol. 1, no. 2, pp. 77-254, 2006.
[83] Z. Zeng, M. Pantic, G. Roisman, and T. Huang, “A Survey of Affect Recognition Methods: Audio, Visual and Spontaneous Expressions,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 1, pp. 39-58, Jan. 2009.
[84] D. Crystal, Prosodic Systems and Intonation in English. Cambridge Univ. Press, 1969.
[85] S. Mitra and T. Acharya, “Gesture Recognition: A Survey,” IEEE Trans. Systems, Man, and Cybernetics, Part C: Applications and Rev., vol. 37, no. 3, pp. 311-324, May 2007.
[86] D. Gatica-Perez, “Automatic Nonverbal Analysis of Social Interaction in Small Groups: A Review,” Image and Vision Computing, vol. 27, no. 12, pp. 1775-1787, 2009.
[87] H. Tischler, Introduction to Sociology. Harcourt Brace College Publishers, 1990.
[88] A. Vinciarelli, “Speakers Role Recognition in Multiparty Audio Recordings Using Social Network Analysis and Duration Distribution Modeling,” IEEE Trans. Multimedia, vol. 9, no. 9, pp. 1215-1226, Oct. 2007.
[89] H. Salamin, S. Favre, and A. Vinciarelli, “Automatic Role Recognition in Multiparty Recordings: Using Social Affiliation Networks for Feature Extraction,” IEEE Trans. Multimedia, vol. 11, no. 7, pp. 1373-1380, Nov. 2009.
[90] R. Barzilay, M. Collins, J. Hirschberg, and S. Whittaker, “The Rules Behind the Roles: Identifying Speaker Roles in Radio Broadcasts,” Proc. 17th Nat'l Conf. Artificial Intelligence, pp. 679-684, 2000.
[91] Y. Liu, “Initial Study on Automatic Identification of Speaker Role in Broadcast News Speech,” Proc. Human Language Technology Conf. NAACL, Companion Volume: Short Papers, pp. 81-84, June 2006.
[92] N. Garg, S. Favre, H. Salamin, D. Hakkani-Tür, and A. Vinciarelli, “Role Recognition for Meeting Participants: An Approach Based on Lexical Information and Social Network Analysis,” Proc. ACM Int'l Conf. Multimedia, pp. 693-696, 2008.
[93] S. Favre, A. Dielmann, and A. Vinciarelli, “Automatic Role Recognition in Multiparty Recordings Using Social Networks and Probabilistic Sequential Models,” Proc. ACM Int'l Conf. Multimedia, pp. 585-588, 2009.
[94] J. Quinlan, C4. 5: Programs for Machine Learning. Morgan Kaufmann, 1993.
[95] S. Banerjee and A. Rudnicky, “Using Simple Speech Based Features to Detect the State of a Meeting and the Roles of the Meeting Participants,” Proc. Int'l Conf. Spoken Language Processing, pp. 221-231, 2004.
[96] K. Laskowski, M. Ostendorf, and T. Schultz, “Modeling Vocal Interaction for Text-Independent Participant Characterization in Multi-Party Conversation,” Proc. Ninth ISCA/ACL SIGdial Workshop Discourse and Dialogue, pp. 148-155, June 2008.
[97] M. Zancanaro, B. Lepri, and F. Pianesi, “Automatic Detection of Group Functional Roles in Face to Face Interactions,” Proc. Int'l Conf. Mutlimodal Interfaces, pp. 47-54, 2006.
[98] W. Dong, B. Lepri, A. Cappelletti, A. Pentland, F. Pianesi, and M. Zancanaro, “Using the Influence Model to Recognize Functional Roles in Meetings,” Proc. Ninth Int'l Conf. Multimodal Interfaces, pp. 271-278, Nov. 2007.
[99] H. Gunes and M. Pantic, “Automatic, Dimensional and Continuous Emotion Recognition,” Int'l J. Synthetic Emotion, vol. 1, no. 1, pp. 68-99, 2010.
[100] B. Pang and L. Lee, “Opinion Mining and Sentiment Analysis,” Foundations and Trends in Information Retrieval, vol. 2, nos. 1/2, pp. 1-135, 2008.
[101] J. Levine and R. Moreland, “Small Groups,” Handbook of Social Psychology, D. Gilbert and G. Lindzey, eds., vol. 2, pp. 415-469, Oxford Univ. Press, 1998.
[102] K. Otsuka, Y. Takemae, and J. Yamato, “A Probabilistic Inference of Multiparty-Conversation Structure Based on Markov-Switching Models of Gaze Patterns, Head Directions, and Utterances,” Proc. ACM Int'l Conf. Multimodal Interfaces, pp. 191-198, 2005.
[103] R. Rienks, D. Zhang, and D. Gatica-Perez, “Detection and Application of Influence Rankings in Small Group Meetings,” Proc. Int'l Conf. Multimodal Interfaces, pp. 257-264, 2006.
[104] R. Rienks and D. Heylen, “Dominance Detection in Meetings Using Easily Obtainable Features,” Proc. Machine Learning for Multimodal Interaction, pp. 76-86, 2006.
[105] D. Jayagopi, H. Hung, C. Yeo, and D. Gatica-Perez, “Modeling Dominance in Group Conversations from Non-Verbal Activity Cues,” IEEE Trans. Audio, Speech, and Language Processing, vol. 17, no. 3, pp. 501-513, Mar. 2009.
[106] D. Funder, “Personality,” Ann. Rev. Psychology, vol. 52, pp. 197-221, 2001.
[107] F. Pianesi, M. Zancanaro, E. Not, C. Leonardi, V. Falcon, and B. Lepri, “Multimodal Support to Group Dynamics,” Personal and Ubiquitous Computing, vol. 12, no. 3, pp. 181-195, 2008.
[108] D. Olguin-Olguin, P. Gloor, and A. Pentland, “Capturing Individual and Group Behavior with Wearable Sensors,” Proc. AAAI Spring Symp. Human Behavior Modeling, 2009.
[109] G. Mohammadi, M. Mortillaro, and A. Vinciarelli, “The Voice of Personality: Mapping Nonverbal Vocal Behavior into Trait Attributions,” Proc. Int'l Workshop Social Signal Processing, pp. 17-20, 2010.
[110] F. Mairesse, M.A. Walker, M.R. Mehl, and R.K. Moore, “Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text,” J. Artificial Intelligence Research, vol. 30, pp. 457-500, 2007.
[111] The Five-Factor Model of Personality, J. Wiggins, ed. Guilford, 1996.
[112] J. Curhan and A. Pentland, “Thin Slices of Negotiation: Predicting Outcomes from Conversational Dynamics within the First 5 Minutes,” J. Applied Psychology, vol. 92, no. 3, pp. 802-811, 2007.
[113] J. Burgoon, L. Stern, and L. Dillman, Interpersonal Adaptation: Dyadic Interaction Patterns. Cambridge Univ. Press, 1995.
[114] T. Chartrand and J. Bargh, “The Chameleon Effect: The Perception-Behavior Link and Social Interaction,” J. Personality and Social Psychology, vol. 76, no. 6, pp. 893-910, 1999.
[115] J. Lakin, V. Jefferis, C. Cheng, and T. Chartrand, “The Chameleon Effect as Social Glue: Evidence for the Evolutionary Significance of Nonconscious Mimicry,” J. Nonverbal Behavior, vol. 27, no. 3, pp. 145-162, 2003.
[116] L. Morency, C. Sidner, C. Lee, and T. Darrell, “Head Gestures for Perceptual Interfaces: The Role of Context in Improving Recognition,” Artificial Intelligence, vol. 171, nos. 8/9, pp. 568-585, 2007.
[117] R. Murray-Smith, A. Ramsay, S. Garrod, M. Jackson, and B. Musizza, “Gait Alignment in Mobile Phone Conversations,” Proc. Int'l Conf. Human-Computer Interaction with Mobile Devices and Services, pp. 214-221, 2007.
[118] A. Vinciarelli, “Capturing Order in Social Interactions,” IEEE Signal Processing Magazine, vol. 26, no. 5, pp. 133-137, Sept. 2009.
[119] D. Hillard, M. Ostendorf, and E. Shriberg, “Detection of Agreement vs. Disagreement in Meetings: Training with Unlabeled Data,” Proc. North Am. Chapter of the Assoc. for Computational Linguistics Human Language Technology, 2003.
[120] M. Galley, K. McKeown, J. Hirschberg, and E. Shriberg, “Identifying Agreement and Disagreement in Conversational Speech: Use of Bayesian Networks to Model Pragmatic Dependencies,” Proc. Meeting Assoc. for Computational Linguistics, pp. 669-676, 2004.
[121] F. Pianesi, M. Zancanaro, E. Not, C. Leonardi, V. Falcon, and B. Lepri, “A Multimodal Annotated Corpus of Consensus Decision Making Meetings,” The J. Language Resources and Evaluation, vol. 41, nos. 3/4, pp. 409-429, 2008.
[122] D. Jayagopi, B. Raducanu, and D. Gatica-Perez, “Characterizing Conversational Group Dynamics Using Nonverbal Behaviour,” Proc. IEEE Int'l Conf. Multimedia and Expo, pp. 370-373, 2009.
[123] M. Cristani, A. Pesarin, C. Drioli, A. Perina, A. Tavano, and V. Murino, “Auditory Dialog Analysis and Understanding by Generative Modelling of Interactional Dynamics,” Proc. Int'l Workshop Computer Vision and Pattern Recognition for Human Behavior, pp. 103-109, 2009.
[124] N. Jovanovic, R. op den Akker, and A. Nijholt, “A Corpus for Studying Addressing Behaviour in Multi-Party Dialogues,” Language Resources and Evaluation, vol. 40, no. 1, pp. 5-23, 2006.
[125] R. Bakeman and J. Gottman, Observing Interaction: An Introduction to Sequential Analysis. Cambridge Univ. Press, 1986.
[126] A. Janin, D. Baron, J. Edwards, D. Ellis, D. Gelbart, N. Morgan, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, and C. Wooters, “The ICSI Meeting Corpus,” Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, vol. I, pp. 364-367, 2003.
[127] I. McCowan, D. Gatica-Perez, S. Bengio, G. Lathoud, M. Barnard, and D. Zhang, “Automatic Analysis of Multimodal Group Actions in Meetings,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 3, pp. 305-317, Mar. 2005.
[128] J. Carletta et al., “The AMI Meeting Corpus: A Pre-Announcement,” Proc. Second Int'l Conf. Machine Learning for Multimodal Interaction, pp. 28-39, 2005.
[129] L. Chen, R. Rose, Y. Qiao, I. Kimbara, F. Parrill, H. Welji, T. Han, J. Tu, Z. Huang, M. Harper, F. Quek, Y. Xiong, D. McNeill, D. Tuttle, and T. Huang, “VACE Multimodal Meeting Corpus,” Proc. Second Int'l Conf. Machine Learning for Multimodal Interaction, pp. 40-51, 2006.
[130] N. Campbell, T. Sadanobu, M. Imura, N. Iwahashi, S. Noriko, and D. Douxchamps, “A Multimedia Database of Meetings and Informal Interactions for Tracking Participant Involvement and Discourse Flow,” Proc. Conf. Language and Resources Evaluation, pp. 391-394, 2006.
[131] A. Vinciarelli, A. Dielmann, S. Favre, and H. Salamin, “Canal9: A Database of Political Debates for Analysis of Social Interactions,” Proc. Int'l Conf. Affective Computing and Intelligent Interaction, vol. 2, pp. 96-99, 2009.
[132] S. Burger, V. MacLaren, and H. Yu, “The ISL Meeting Corpus: The Impact of Meeting Type on Speech Style,” Proc. Int'l Conf. Spoken Language Processing, pp. 301-304, 2002.
[133] J. Garofolo, C. Laprun, M. Michel, V. Stanford, and E. Tabassi, “The NIST Meeting Room Pilot Corpus,” Proc. Language Resource and Evaluation Conf., 2004.
[134] N. Mana, B. Lepri, P. Chippendale, A. Cappelletti, F. Pianesi, P. Svaizer, and M. Zancanaro, “Multimodal Corpus of Multi-Party Meetings for Automatic Social Behavior Analysis and Personality Traits Detection,” Proc. Workshop Tagging, Mining and Retrieval of Human Related Activity Information, pp. 9-14, 2007.
[135] A. Vinciarelli and M. Pantic, “Techware:, a Web Portal for Social Signal Processing,” IEEE Signal Processing Magazine, vol. 27, no. 4, pp. 142 -144, July 2010.
[136] M. Knapp and J. Hall, Nonverbal Communication in Human Interaction. Harcourt Brace College Publishers, 1972.
[137] A. Scheflen, “The Significance of Posture in Communication Systems,” Psychiatry, vol. 27, pp. 316-331, 1964.
[138] E. Hall, The Silent Language. Doubleday, 1959.
[139] H. Triandis, Culture and Social Behavior. McGraw-Hill, 1994.
[140] Nonverbal Communication: Where Nature Meets Culture, U. Segerstrale and P. Molnar, eds. Lawrence Erlbaum Assoc., 1997.
[141] M. Pantic, A. Pentland, A. Nijholt, and T. Huang, “Human Computing and Machine Understanding of Human Behavior: A Survey,” Proc. Eighth Int'l Conf. Multimodal Interfaces, vol. 4451, pp. 47-71, 2007.
[142] J. Russell, J. Bachorowski, and J. Fernandez-Dols, “Facial and Vocal Expressions of Emotion,” Ann. Rev. Psychology, vol. 54, no. 1, pp. 329-349, 2003.
[143] J. Cassell, T.W. Bickmore, M. Billinghurst, L. Campbell, K. Chang, H.H. Vilhjalmsson, and H. Yan, “Embodiment in Conversational Interfaces: Rea,” Proc. SIGCHI Conf. Human Factors in Computing Systems, pp. 520-527, 1999.
[144] E. Schegloff, “Analyzing Single Episodes of Interaction: An Exercise in Conversation Analysis,” Social Psychology Quarterly, vol. 50, no. 2, pp. 101-114, 1987.
[145] J. Cassell, C. Pelachaud, N. Badler, M. Steedman, B. Achorn, T. Becket, B. Douville, S. Prevost, and M. Stone, “Animated Conversation: Rule-Based Generation of Facial Expression, Gesture and Spoken Intonation for Multiple Conversational Agents,” Proc. 21st Ann. Conf. Computer Graphics Interactive Techniques, pp. 413-420, 1994.
[146] K. Thórisson, “Natural Turn-Taking Needs No Manual,” Multimodality in Language and Speech Systems, I.K.B. Granström and D. House, ed., pp. 173-207, Kluwer Academic Publishers, 2002.
[147] J. Bonaiuto and K.R. Thórisson, “Towards a Neurocognitive Model of Realtime Turntaking in Face-to-Face Dialogue,” Embodied Comm. in Humans And Machines, G.K.I. Wachsmuth and M. Lenzen, ed., pp. 451-484, Oxford Univ. Press, 2008.
[148] K. Prepin and A. Revel, “Human-Machine Interaction as a Model of Machine-Machine Interaction: How to Make Machines Interact as Humans Do,” Advanced Robotics, vol. 21, no. 15, pp. 1709-1723, 2007.
[149] R.M. Maatman, J. Gratch, and S. Marsella, “Natural Behavior of a Listening Agent,” Proc. Int'l Conf. Intelligent Virtual Agents, pp. 25-36, 2005.
[150] L. Morency, I. de Kok, and J. Gratch, “Predicting Listener Backchannels: A Probabilistic Multimodal Approach,” Proc. Int'l Conf. Intelligent Virtual Agents, pp. 176-190, 2008.
[151] S. Kopp, T. Stocksmeier, and D. Gibbon, “Incremental Multimodal Feedback for Conversational Agents,” Proc. Eighth Int'l Conf. Intelligent Virtual Agents, pp. 139-146, 2007.
[152] J. Urbain, S. Dupont, T. Dutoit, R. Niewiadomski, and C. Pelachaud, “Towards a Virtual Agent Using Similarity-Based Laughter Production,” Proc. Interdisciplinary Workshop Laughter and Other Interactional Vocalisations in Speech, 2009.
[153] S. Sundaram and S. Narayanan, “Automatic Acoustic Synthesis of Human-Like Laughter,” J. Acoustical Soc. Am., vol. 121, no. 1, pp. 527-535, 2007.
[154] S. Pammi and M. Schröder, “Annotating Meaning of Listener Vocalizations for Speech Synthesis,” Proc. Int'l Conf. Affective Computing and Intelligent Interaction, pp. 453-458, 2009.
[155] J. Trouvain and M. Schröder, “How (Not) to Add Laughter to Synthetic Speech,” Proc. Workshop Affective Dialogue Systems, pp. 229-232, 2004.
[156] M. Schröder, “Experimental Study of Affect Bursts,” Speech Comm. Special issue speech and emotion, vol. 40, nos. 1/2, pp. 99-116, 2003.
[157] K.R. Scherer, “Affect Bursts,” Emotions: Essays on Emotion Theory, S.H.M. van Goozen, N.E. van de Poll, and J.A. Sergeant, eds., pp. 161-193, Lawrence Erlbaum, 1994.
[158] P. Ekman, “Biological and Cultural Contributions to Body and Facial Movement,” Anthropology of the Body, J. Blacking, ed., pp. 39-84, Academic Press, 1977.
[159] A. Kendon, Conducting Interaction: Pattern of Behavior in Focused Encounter. Cambridge Univ. Press, 1990.
[160] A. Scheflen, Body Language and Social Order. Prentice-Hall, Inc., 1973.
[161] C. Pedica and H.H. Vilhjálmsson, “Spontaneous Avatar Behavior for Human Territoriality,” Proc. Int'l Conf. Intelligent Virtual Agents, pp. 344-357, 2009.
[162] D. Jan and D.R. Traum, “Dynamic Movement and Positioning of Embodied Agents in Multiparty Conversations,” Proc. Sixth Int'l Joint Conf. Autonomous Agents and Multiagent Systems, 2007.
[163] D. Helbing and P. Molnár, “Social Force Model for Pedestrian Dynamics,” Physical Rev. E, vol. 51, no. 5, pp. 4282-4287, 1995.
[164] M. Schröder, “Expressing Degree of Activation in Synthetic Speech,” IEEE Trans. Audio, Speech, and Language Processing, vol. 14, no. 4, pp. 1128-1136, July 2006.
[165] J. Trouvain, S. Schmidt, M. Schröder, M. Schmitz, and W.J. Barry, “Modelling Personality Features by Changing Prosody in Synthetic Speech,” Proc. Speech Prosody, 2006.
[166] F. Burkhardt and W.F. Sendlmeier, “Verification of Acoustical Correlates of Emotional Speech Using Formant Synthesis,” Proc. ISCA Workshop Speech and Emotion, pp. 151-156, 2000.
[167] M. Walker, J. Cahn, and S. Whittaker, “Linguistic Style Improvisation for Lifelike Computer Characters,” AAAI Workshop Entertainment and AI/A-Life, aAAI Technical Report WS-96-03, 1996.
[168] S. Gupta, M.A. Walker, and D.M. Romano, “Generating Politeness in Task Based Interaction: An Evaluation of the Effect of Linguistic form and Culture,” Proc. 11th European Workshop Natural Language Generation, pp. 57-64, 2007.
[169] E. André, M. Rehm, W. Minker, and D. Buhler, “Endowing Spoken Language Dialogue Systems with Emotional Intelligence,” Proc. Affective Dialogue Systems, pp. 178-187, 2004.
[170] L. Johnson, P. Rizzo, W. Bosma, M. Ghijsen, and H. van Welbergen, “Generating Socially Appropriate Tutorial Dialog,” Proc. Affective Dialogue Systems, pp. 254-264, 2004.
[171] K. Porayska-Pomsta and C. Mellish, “Modelling Politeness in Natural Language Generation,” Proc. Int'l Conf. Natural Language Generation, pp. 141-150, 2004.
[172] P. Brown and S.C. Levinson, Politeness—Some Universals in Language Usage. Cambridge Univ. Press, 1987.
[173] M. de Jong, M. Theune, and D. Hofs, “Politeness and Alignment in Dialogues with a Virtual Guide,” Proc. Seventh Int'l Conf. Autonomous Agents and Multiagent Systems, pp. 207-214, 2008.
[174] M. Rehm and E. André, “Informing the Design of Embodied Conversational Agents by Analysing Multimodal Politeness Behaviors in Human-Human Communication,” Proc. Workshop Conversational Informatics for Supporting Social Intelligence and Interaction, 2005.
[175] R. Niewiadomski and C. Pelachaud, “Model of Facial Expressions Management for an Embodied Conversational Agent,” Proc. Second Int'l Conf. Affective Computing and Intelligent Interaction, pp. 12-23.
[176] H. Prendinger and M. Ishizuka, “Social Role Awareness in Animated Agents,” Proc. Int'l Conf. Autonomous Agents, pp. 270-277, 2001.
[177] M. Schröder, “Expressive Speech Synthesis: Past, Present, and Possible Futures,” Affective Information Processing, J. Tao and T. Tan, eds., pp. 111-126, Springer, 2009.
[178] A. Hunt and A.W. Black, “Unit Selection in a Concatenative Speech Synthesis System Using a Large Speech Database,” Proc. IEEE Int'l Conf. Audio, Speech, and Signal Processing, vol. 1, pp. 373-376, 1996.
[179] J.F. Pitrelli, R. Bakis, E.M. Eide, R. Fernandez, W. Hamza, and M.A. Picheny, “The IBM Expressive Text-to-Speech Synthesis System for American English,” IEEE Trans. Audio, Speech, and Language Processing, vol. 14, no. 4, pp. 1099-1108, July 2006.
[180] W.L. Johnson, S.S. Narayanan, R. Whitney, R. Das, M. Bulut, and C. LaBore, “Limited Domain Synthesis of Expressive Military Speech for Animated Characters,” Proc. IEEE Workshop Speech Synthesis, pp. 163-166, 2002.
[181] P. Gebhard, M. Schröder, M. Charfuelan, C. Endres, M. Kipp, S. Pammi, M. Rumpler, and O. Türk, “IDEAS4Games: Building Expressive Virtual Characters for Computer Games,” Proc. Intelligent Virtual Agents, pp. 426-440, 2008.
[182] E. Zovato, A. Pacchiotti, S. Quazza, and S. Sandri, “Towards Emotional Speech Synthesis: A Rule Based Approach,” Proc. ISCA Speech Synthesis Workshop, pp. 219-220, 2004.
[183] O. Türk and M. Schröder, “A Comparison of Voice Conversion Methods for Transforming Voice Quality in Emotional Speech Synthesis,” Proc. Interspeech, 2008.
[184] M. Schröder, “Interpolating Expressions in Unit Selection,” Proc. Int'l Conf. Affective Computing and Intelligent Interaction, pp. 718-720, 2007.
[185] T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Simultaneous Modeling of Spectrumc Pitch and Duration in HMM-Based Speech Synthesis,” Proc. Eurospeech, 1999.
[186] J. Yamagishi, K. Onishi, T. Masuko, and T. Kobayashi, “Modeling of Various Speaking Styles and Emotions for HMM-Based Speech Synthesis,” Proc. Eurospeech, pp. 2461-2464, 2003.
[187] T. Nose, J. Yamagishi, and T. Kobayashi, “A Style Control Technique for Speech Synthesis Using Multiple Regression HSMM,” Proc. Interspeech, pp. 1324-1327, 2006.
[188] K. Miyanaga, T. Masuko, and T. Kobayashi, “A Style Control Technique for HMM-Based Speech Synthesis,” Proc. Eighth Int'l Conf. Spoken Language Processing, pp. 1437-1440, 2004.
[189] J. Yamagishi, T. Kobayashi, M. Tachibana, K. Ogata, and Y. Nakano, “Model Adaptation Approach to Speech Synthesis with Diverse Voices and Styles,” Proc. Int'l Conf. Audio, Speech and Signal Processing, vol. IV, pp. 1233-1236, 2007.
[190] R. Fernandez and B. Ramabhadran, “Automatic Exploration of Corpus-Specific Properties for Expressive Text-to-Speech: A Case Study in Emphasis,” Proc. Sixth ISCA Workshop Speech Synthesis, pp. 34-39, 2007.
[191] V. Strom, A. Nenkova, R. Clark, Y. Vazquez-Alvarez, J. Brenier, S. King, and D. Jurafsky, “Modelling Prominence and Emphasis Improves Unit-Selection Synthesis,” Proc. Interspeech, pp. 1282-1285, 2007.
[192] Ruth,, 2012.
[193] Cadia,, 2012.
[194] Greta,, 2012.
[195] SmartBody, http:/, 2012.
[196] H. Vilhjalmsson, N. Cantelmo, J. Cassell, N.E. Chafai, M. Kipp, S. Kopp, M. Mancini, S. Marsella, A.N. Marshall, C. Pelachaud, Z. Ruttkay, K.R. Thórisson, H. van Welbergen, and R. van der Werf, “The Behavior Markup Language: Recent Developments and Challenges,” Proc. Int'l Conf. Intelligent Virtual Agents, pp. 99-111, Sept. 2007.
[197] D. Heylen, S. Kopp, S. Marsella, C. Pelachaud, and H. Vilhjalmsson, “Why Conversational Agents Do What They Do? Functional Representations for Generating Conversational Agent Behavior,” Proc. Seventh Int'l Conf. Autonomous Agents and Multiagent Systems, 2008.
[198] BML, http://wiki.mindmakers.org/projects:bml:main, 2012.
[199] M. Schröder, “The SEMAINE API: Towards a Standards-Based Framework for Building Emotion-oriented Systems,” Advances in Human-Computer Interaction, pp. 319-406, 2010.
[200] Festival, /, 2012.
[201] OPENMARY, http:/, 2012.
[202] EULER,, 2012.
[203] M. Gladwell, Blink: The Power of Thinking without Thinking. Little Brown & Company, 2005.
[204] S.E. Hyman, “A New Image for Fear and Emotion,” Nature, vol. 393, pp. 417-418, 1998.
[205] E. Douglas-Cowie, L. Devillers, J.C. Martin, R. Cowie, S. Savvidou, S. Abrilian, and C. Cox, “Multimodal Databases of Everyday Emotion: Facing Up To Complexity,” Proc. Interspeech, pp. 813-816, 2005.
[206] G. Hofer, J. Yamagishi, and H. Shimodaira, “Speech-Driven Lip Motion Generation with a Trajectory HMM,” Proc. Interspeech, pp. 2314-2317, 2008.
[207] J. Bates, “The Role of Emotion in Believable Agents,” Comm. ACM, vol. 37, no. 7, pp. 122-125, 1994.
[208] J.S. Uleman, L.S. Newman, and G.B. Moskowitz, “People as Flexible Interpreters: Evidence and Issues from Spontaneous Trait Inference,” Advances in Experimental Social Psychology, M.P. Zanna, ed., vol. 28, pp. 211-279, Elsevier, 1996.
[209] J.S. Uleman, S.A. Saribay, and C.M. Gonzalez, “Spontaneous Inferences, Implicit Impressions, and Implicit Theories,” Ann. Rev. Psychology, vol. 59, pp. 329-360, 2008.
[210] K. Isbister and C. Nass, “Consistency of Personality in Interactive Characters: Verbal Cues, Non-Verbal Cues, and User Characteristics,” Int'l J. Human-Computer Studies, vol. 53, no. 2, pp. 251-267, 2000.
[211] B. de Gelder and J. Vroomen, “The Perception of Emotions by Ear and by Eye,” Cognition and Emotion, vol. 14, no. 3, pp. 289-311, 2000.
[212] M. Mori, “The Uncanny Valley,” Energy, vol. 7, no. 4, pp. 33-35, 1970.
[213] T.B. Moeslund, A. Hilton, and V. Krüger, “A Survey of Advances in Vision-Based Human Motion Capture and Analysis,” Computer Vision and Image Understanding, vol. 104, nos. 2/3, pp. 90-126, 2006.
[214] M.E. Foster, “Comparing Rule-Based and Data-Driven Selection of Facial Displays,” Proc. Workshop Embodied Language Processing, pp. 1-8, 2007.
[215] M. Buchanan, “The Science of Subtle Signals,” Strategy+Business, vol. 48, pp. 68-77, 2007.
[216] K. Greene, “10 Emerging Technologies 2008,” MIT Technology Rev., Feb. 2008.
[217] S. Dumais, E. Cutrell, J. Cadiz, G. Jancke, R. Sarin, and D. Robbins, “Stuff I've Seen: A System for Personal Information Retrieval and Re-Use,” Proc. 26th Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 72-79, 2003.
[218] C. Weng, W. Chu, and J. Wu, “Rolenet: Movie Analysis from the Perspective of Social Networks,” IEEE Trans. Multimedia, vol. 11, no. 2, pp. 256-271, Feb. 2009.
[219] M. Pantic and A. Vinciarelli, “Implicit Human Centered Tagging,” IEEE Signal Processing Magazine, vol. 26, no. 6, pp. 173-180, Nov. 2009.
[220] I. Arapakis, Y. Moshfeghi, H. Joho, R. Ren, D. Hannah, and J. Jose, “Integrating Facial Expressions Into User Profiling for the Improvement of a Multimodal Recommender System,” Proc. IEEE Int'l Conf. Multimedia and Expo, pp. 1440-1443, 2009.
[221] I. Arminen and A. Weilenmann, “Mobile Presence and Intimacy—Reshaping Social Actions in Mobile Contextual Configuration,” J. Pragmatics, vol. 41, no. 10, pp. 1905-1923, 2009.
[222] M. Raento, A. Oulasvirta, and N. Eagle, “Smartphones: An Emerging Tool for Social Scientists,” Sociological Methods and Research, vol. 37, no. 3, pp. 426-454, 2009.
[223] S. Strachan and R. Murray-Smith, “Nonvisual, Distal Tracking of Remote Agents in Geosocial Interaction,” Proc. Symp. Location and Context Awareness, 2009.
[224] H. Ishii and M. Kobayashi, “ClearBoard: A Seamless Medium for Shared Drawing and Conversation with Eye Contact,” Proc. SIGCHI Conf. Human Factors in Computing Systems, pp. 525-532, 1992.
[225] J. Bailenson and N. Yee, “Virtual Interpersonal Touch and Digital Chameleons,” J. Nonverbal Behavior, vol. 31, no. 4, pp. 225-242, 2007.
[226] N. Ambady, M. Krabbenhoft, and D. Hogan, “The 30-Sec Sale: Using Thin-Slice Judgments to Evaluate Sales Effectiveness,” J. Consumer Psychology, vol. 16, no. 1, pp. 4-13, 2006.
[227] A. Chattopadhyay, D. Dahl, R. Ritchie, and K. Shahin, “Hearing Voices: The Impact of Announcer Speech Characteristics on Consumer Response to Broadcast Advertising,” J. Consumer Psychology, vol. 13, no. 3, pp. 198-204, 2003.
[228] D. Wooten and A. Reed II, “A Conceptual Overview of the Self-Presentational Concerns and Response Tendencies of Focus Group Participants,” J. Consumer Psychology, vol. 9, no. 3, pp. 141-153, 2000.
[229] W. Breitfuss, H. Prendinger, and M. Ishizuka, “Automatic Generation of Non-Verbal Behavior for Agents in Virtual Worlds: A System for Supporting Multimodal Conversations of Bots and Avatars,” Proc. Third Int'l Conf. Online Communities and Social Computing, pp. 153-161, 2009.
[230] A. Sagae, B. Wetzel, A. Valente, and W.L. Johnson, “Culture-Driven Response Strategies for Virtual Human Behavior in Training Systems,” Proc. Speech and Language Technologies in Education, 2009.
[231] K. Dautenhahn, “Socially Intelligent Robots: Dimensions of Human-Robot Interaction,” Philosophical Trans. Royal Soc. B, vol. 362, pp. 679-704, 2007.

Index Terms:
human computer interaction, behavioural sciences, social behavior generation, social animal, unsocial machine, social signal processing, social intelligence gap, social interaction, social exchanges, synthesis studies techniques, Humans, Signal processing, Face, Animals, Electronic mail, Context, social interactions understanding, Social signal processing, nonverbal behavior analysis and synthesis
Alessandro Vinciarelli, M. Pantic, D. Heylen, C. Pelachaud, I. Poggi, F. D'Errico, M. Schroeder, "Bridging the Gap between Social Animal and Unsocial Machine: A Survey of Social Signal Processing," IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 69-87, Jan.-March 2012, doi:10.1109/T-AFFC.2011.27