Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses
Issue No. 09 - Sept. (2012 vol. 34)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2012.67
Bangpeng Yao, Comput. Sci. Dept., Stanford Univ., Stanford, CA, USA
Li Fei-Fei, Comput. Sci. Dept., Stanford Univ., Stanford, CA, USA
Detecting objects in cluttered scenes and estimating articulated human body parts from 2D images are two challenging problems in computer vision. The difficulty is particularly pronounced in activities involving human-object interactions (e.g., playing tennis), where the relevant objects tend to be small or only partially visible and the human body parts are often self-occluded. We observe, however, that objects and human poses can serve as mutual context to each other: recognizing one facilitates the recognition of the other. In this paper, we propose a mutual context model to jointly model objects and human poses in human-object interaction activities. In our approach, object detection provides a strong prior for better human pose estimation, while human pose estimation improves the accuracy of detecting the objects that interact with the human. On a six-class sports data set and a 24-class people-interacting-with-musical-instruments data set, we show that our mutual context model outperforms the state of the art in detecting very difficult objects and estimating human poses, as well as classifying human-object interaction activities.
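The mutual-context idea in the abstract can be illustrated with a toy sketch. This is not the paper's model: the hypothesis names, the scores, and the compatibility table below are all made-up assumptions, chosen only to show how a joint score over an object hypothesis and a pose hypothesis can overturn a choice that each detector would make on its own.

```python
from itertools import product

# Hypothetical per-hypothesis detector scores (illustrative numbers only).
object_scores = {"racket_near_hand": 0.4, "racket_on_ground": 0.5}
pose_scores = {"forehand_swing": 0.6, "standing": 0.3}

# Hypothetical compatibility between an object hypothesis and a pose
# hypothesis; higher means the pair is more plausible together.
compatibility = {
    ("racket_near_hand", "forehand_swing"): 0.8,
    ("racket_near_hand", "standing"): 0.1,
    ("racket_on_ground", "forehand_swing"): -0.5,
    ("racket_on_ground", "standing"): 0.2,
}


def independent_choice():
    """Pick the best object and the best pose separately, ignoring context."""
    obj = max(object_scores, key=object_scores.get)
    pose = max(pose_scores, key=pose_scores.get)
    return obj, pose


def joint_choice():
    """Pick the (object, pose) pair that maximizes the summed joint score."""
    def total(pair):
        obj, pose = pair
        return object_scores[obj] + pose_scores[pose] + compatibility[pair]

    return max(product(object_scores, pose_scores), key=total)


if __name__ == "__main__":
    print("independent:", independent_choice())
    print("joint:", joint_choice())
```

In this toy setting the object detector alone prefers the weaker-context hypothesis, while the joint score lets the confident pose estimate pull the object choice toward the compatible location.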
Humans, Context, Estimation, Context modeling, Object detection, Biological system modeling, Sports equipment, conditional random field, Mutual context, action recognition, human pose estimation, object detection
Bangpeng Yao, Li Fei-Fei, "Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 9, pp. 1691-1703, Sept. 2012, doi:10.1109/TPAMI.2012.67