Issue No. 04 - Fourth Quarter (2012 vol. 3)
Felix Weninger, Technische Universität München, Munich
Jarek Krajewski, Schumpeter School of Business and Economics, Bergische Universität Wuppertal, Wuppertal
Anton Batliner, Friedrich-Alexander University Erlangen-Nuremberg, Erlangen
Björn Schuller, Technische Universität München
We introduce the automatic determination of leadership emergence from acoustic and linguistic features in online speeches. Full realism is provided by the varying and challenging acoustic conditions of the presented YouTube corpus of speeches available online, labeled by 10 raters, and by processing that includes Long Short-Term Memory-based robust voice activity detection (VAD) and automatic speech recognition (ASR) prior to feature extraction. We discuss cluster-preserving scaling of 10 original dimensions for discrete and continuous task modeling, ground truth establishment, and appropriate feature extraction for this novel speaker trait analysis paradigm. In extensive classification and regression runs, different temporal chunkings and optimal late fusion strategies (LFSs) of feature streams are presented. As a result, achievers, charismatic speakers, and team players can be recognized significantly above chance level, with up to 72.5 percent accuracy on unseen test data.
Linguistics, Ethics, Training, Speech recognition, YouTube, Acoustics, Pragmatics, acoustic/linguistic fusion, Personality analysis, dimensional analysis
F. Weninger, J. Krajewski, A. Batliner and B. Schuller, "The Voice of Leadership: Models and Performances of Automatic Analysis in Online Speeches," in IEEE Transactions on Affective Computing, vol. 3, no. 4, pp. 496-508, 2012.