An appealing scheme to characterize expressive behaviors is the use of emotional dimensions such as activation (calm versus active) and valence (negative versus positive). These descriptors offer many advantages for describing the wide spectrum of emotions. Because expressive vocal and gestural behaviors are continuous and change quickly, it is desirable to track these emotional traces continuously, capturing subtle and localized events (e.g., with FEELTRACE). However, time-continuous annotations introduce challenges that affect the reliability of the labels. In particular, an important issue is the evaluators’ reaction lag, caused by observing, appraising, and responding to the expressive behaviors. An empirical analysis demonstrates that this delay varies from one to six seconds, depending on the annotator, expressive dimension, and actual behaviors. Our experiments show accuracy improvements even when the annotations are shifted by fixed delays (1-3 seconds). This paper proposes to compensate for this reaction lag by finding the time-shift that maximizes the mutual information between the expressive behaviors and the time-continuous annotations. The approach is implemented under different assumptions about the evaluators’ reaction lag. The benefits of compensating for the delay are demonstrated with emotion classification experiments. On average, the classifiers trained with facial and speech features show relative improvements of more than 7% over baseline classifiers trained and tested without shifting the time-continuous annotations.
Carlos Busso, "Correcting Time-Continuous Emotional Labels by Modeling the Reaction Lag of Evaluators", IEEE Transactions on Affective Computing, no. 1, pp. 1, PrePrints, doi:10.1109/TAFFC.2014.2334294
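
The time-shift search described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the histogram-based mutual information estimator, the number of bins, the single scalar feature, the hypothetical 10 frames-per-second rate, and the synthetic signals are all assumptions made for illustration. The function names `mutual_information` and `estimate_lag` are likewise hypothetical.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram-based estimate of I(X;Y) in bits (an assumed estimator)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()               # joint distribution
    px = pxy.sum(axis=1, keepdims=True)     # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)     # marginal of y, shape (1, bins)
    nz = pxy > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def estimate_lag(feature, annotation, max_lag, bins=8):
    """Return (lag, mi): the annotation shift in frames that maximizes MI."""
    best_lag, best_mi = 0, -np.inf
    for lag in range(max_lag + 1):
        # Pair the behavior at frame t with the annotation at frame t + lag.
        f = feature[: len(feature) - lag]
        a = annotation[lag:]
        mi = mutual_information(f, a, bins)
        if mi > best_mi:
            best_lag, best_mi = lag, mi
    return best_lag, best_mi


# Synthetic demonstration (all values below are illustrative assumptions).
fps = 10                                    # hypothetical frame rate
true_delay = 3 * fps                        # simulated 3-second reaction lag
rng = np.random.default_rng(0)
n = 5000

# A lightly smoothed random signal stands in for an expressive feature.
feature = np.convolve(rng.normal(size=n), np.ones(5) / 5, mode="same")

# The simulated annotation follows the feature after the delay, plus noise.
annotation = np.concatenate([np.full(true_delay, feature[0]),
                             feature[:-true_delay]])
annotation = annotation + 0.1 * rng.normal(size=n)

lag, mi = estimate_lag(feature, annotation, max_lag=6 * fps)
print(f"estimated lag: {lag / fps:.1f} s (MI = {mi:.2f} bits)")
```

The search range of six seconds mirrors the upper end of the delays reported in the abstract. In practice the estimate could be computed per annotator or per emotional dimension, since the abstract notes that the delay depends on both; how the paper groups these estimates is not specified here.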