Issue No.04 - Fourth Quarter (2012 vol.3)
pp: 496-508
Felix Weninger, Technische Universität München, Munich
Jarek Krajewski, Schumpeter School of Business and Economics, Bergische Universität Wuppertal, Wuppertal
Anton Batliner, Friedrich-Alexander University Erlangen-Nuremberg, Erlangen
Björn Schuller, Technische Universität München
ABSTRACT
We introduce the automatic determination of leadership emergence from acoustic and linguistic features in online speeches. Realism is ensured by the varying and challenging acoustic conditions of the presented YouTube corpus of speeches available online, labeled by 10 raters, and by processing that includes robust Long Short-Term Memory-based voice activity detection (VAD) and automatic speech recognition (ASR) prior to feature extraction. We discuss cluster-preserving scaling of the 10 original dimensions for discrete and continuous task modeling, ground truth establishment, and appropriate feature extraction for this novel speaker trait analysis paradigm. In extensive classification and regression runs, we present different temporal chunkings and optimal late fusion strategies (LFSs) of feature streams. As a result, achievers, charismatic speakers, and team players can be recognized significantly above chance level, with up to 72.5 percent accuracy on unseen test data.
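The abstract does not say which late fusion strategies were evaluated, so the following is only a minimal sketch of one common form of late fusion: a weighted average of the class posteriors produced by separate per-stream classifiers. The function names, class labels ("leader", "non_leader"), posterior values, and weights are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Dict, List


def late_fusion(stream_posteriors: List[Dict[str, float]],
                weights: List[float]) -> Dict[str, float]:
    """Fuse per-stream class posteriors by a weighted average.

    Each dict maps a class label to the posterior that one feature
    stream's classifier (e.g. acoustic or linguistic) assigns to it.
    This is an illustrative scheme, not the paper's documented method.
    """
    if not stream_posteriors or len(stream_posteriors) != len(weights):
        raise ValueError("need one weight per stream")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    classes = stream_posteriors[0].keys()
    return {
        c: sum(w * p[c] for w, p in zip(weights, stream_posteriors)) / total
        for c in classes
    }


def decide(fused: Dict[str, float]) -> str:
    """Return the class label with the highest fused posterior."""
    return max(fused, key=fused.get)


# Hypothetical posteriors from two independently trained stream classifiers.
acoustic = {"leader": 0.70, "non_leader": 0.30}
linguistic = {"leader": 0.40, "non_leader": 0.60}

fused_equal = late_fusion([acoustic, linguistic], [0.5, 0.5])
label_equal = decide(fused_equal)

# Trusting the linguistic stream more changes the fused decision.
fused_ling = late_fusion([acoustic, linguistic], [0.2, 0.8])
label_ling = decide(fused_ling)
```

With equal weights the acoustic stream's stronger vote carries the decision; shifting weight toward the linguistic stream reverses it. This is why the choice of fusion weights is usually tuned on held-out data rather than on the test set.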
INDEX TERMS
Linguistics, Ethics, Training, Speech recognition, YouTube, Acoustics, Pragmatics, Acoustic/linguistic fusion, Personality analysis, Dimensional analysis
CITATION
Felix Weninger, Jarek Krajewski, Anton Batliner, Björn Schuller, "The Voice of Leadership: Models and Performances of Automatic Analysis in Online Speeches", IEEE Transactions on Affective Computing, vol. 3, no. 4, pp. 496-508, Fourth Quarter 2012, doi:10.1109/T-AFFC.2012.15