The Faces of Engagement: Automatic Recognition of Student Engagement from Facial Expressions
Issue No. 1, March 2014 (vol. 5), pp. 86-98
Jacob Whitehill, Machine Perception Laboratory (MPLab), University of California, San Diego, La Jolla, CA
Zewelanji Serpell, Department of Psychology, Virginia Commonwealth University, Richmond, VA
Yi-Ching Lin, Department of Psychology, Virginia State University, Petersburg, VA
Aysha Foster, Department of Psychology, Virginia State University, Petersburg, VA
Javier R. Movellan, MPLab and Emotient, Inc., La Jolla, CA
ABSTRACT
Student engagement is a key concept in contemporary education, where it is valued as a goal in its own right. In this paper we explore approaches for automatic recognition of engagement from students’ facial expressions. We studied whether human observers can reliably judge engagement from the face; analyzed the signals observers use to make these judgments; and automated the process using machine learning. We found that human observers reliably agree when discriminating low versus high degrees of engagement (Cohen’s $\kappa = 0.96$). When fine discrimination is required (four distinct levels) the reliability decreases, but is still quite high ($\kappa = 0.56$). Furthermore, we found that engagement labels of 10-second video clips can be reliably predicted from the average labels of their constituent frames (Pearson $r=0.85$), suggesting that static expressions contain the bulk of the information used by observers. We used machine learning to develop automatic engagement detectors and found that for binary classification (e.g., high engagement versus low engagement), automated engagement detectors perform with comparable accuracy to humans. Finally, we show that both human and automatic engagement judgments correlate with task performance. In our experiment, student post-test performance was predicted with comparable accuracy from engagement labels ($r=0.47$) as from pre-test scores ($r=0.44$).
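The abstract reports inter-rater reliability via Cohen's $\kappa$ and predicts clip-level engagement from averaged frame-level labels. The paper's actual detectors and features are not reproduced here; the sketch below is only a minimal, self-contained illustration of those two ideas, using hypothetical observer labels on the paper's four-level engagement scale (all data and function names are assumptions, not from the paper).

```python
import numpy as np

def cohens_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two raters'
    categorical labels (1.0 = perfect agreement, 0.0 = chance level)."""
    a, b = np.asarray(a), np.asarray(b)
    cats = np.union1d(a, b)
    po = np.mean(a == b)  # observed agreement
    # expected chance agreement from each rater's marginal label frequencies
    pe = sum(np.mean(a == c) * np.mean(b == c) for c in cats)
    return (po - pe) / (1 - pe)

def clip_label_from_frames(frame_labels):
    """Predict a clip's engagement label as the mean of its frame labels,
    mirroring the frame-averaging finding described in the abstract."""
    return np.mean(frame_labels)

# Hypothetical labels from two observers on a 1-4 engagement scale
rater1 = [1, 2, 4, 4, 3, 1, 2, 4]
rater2 = [1, 2, 4, 3, 3, 1, 2, 4]
print(round(cohens_kappa(rater1, rater2), 2))  # → 0.83
```

On real data one would typically use `sklearn.metrics.cohen_kappa_score` and `scipy.stats.pearsonr` rather than hand-rolled versions; they are written out here only to make the definitions explicit.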
INDEX TERMS
Training, Labeling, Games, Software, Tablet computers, Observers, Reliability, intelligent tutoring systems, student engagement, engagement recognition, facial expression recognition, facial actions
CITATION
Jacob Whitehill, Zewelanji Serpell, Yi-Ching Lin, Aysha Foster, Javier R. Movellan, "The Faces of Engagement: Automatic Recognition of Student Engagement from Facial Expressions", IEEE Transactions on Affective Computing, vol. 5, no. 1, pp. 86-98, March 2014, doi:10.1109/TAFFC.2014.2316163