Issue No. 03 - March (2012 vol. 34)
C. Schmid, LEAR team, INRIA Rhone-Alpes, St. Ismier, France
A. Prest, Comput. Vision Lab., ETH Zurich, Zurich, Switzerland
V. Ferrari, Comput. Vision Lab., ETH Zurich, Zurich, Switzerland
We introduce a weakly supervised approach for learning human actions modeled as interactions between humans and objects. Our approach is human-centric: We first localize a human in the image and then determine the object relevant for the action and its spatial relation with the human. The model is learned automatically from a set of still images annotated only with the action label. Our approach relies on a human detector to initialize the model learning. For robustness to various degrees of visibility, we build a detector that learns to combine a set of existing part detectors. Starting from humans detected in a set of images depicting the action, our approach determines the action object and its spatial relation to the human. Its final output is a probabilistic model of the human-object interaction, i.e., the spatial relation between the human and the object. We present an extensive experimental evaluation on an existing sports action data set, the PASCAL Action 2010 data set, and a new human-object interaction data set.
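To make the idea of a probabilistic model of the human-object spatial relation concrete, the following is a minimal sketch, not the authors' model. It assumes boxes given as (x_min, y_min, width, height), describes the object by its offset and log area ratio relative to the human, and fits an independent Gaussian to each feature. The function names, the feature choice, and the Gaussian form are all illustrative assumptions.

```python
import math
from statistics import mean, pvariance


def relative_features(human, obj):
    """Object position and scale relative to the human box.

    Boxes are (x_min, y_min, width, height); this layout is an assumption.
    """
    hx, hy, hw, hh = human
    ox, oy, ow, oh = obj
    # Offset between box centers, normalized by the human's size.
    dx = ((ox + ow / 2) - (hx + hw / 2)) / hw
    dy = ((oy + oh / 2) - (hy + hh / 2)) / hh
    # Log ratio of object area to human area.
    log_scale = math.log((ow * oh) / (hw * hh))
    return (dx, dy, log_scale)


def fit_diagonal_gaussian(feature_rows, min_var=1e-3):
    """Fit one Gaussian per feature; variances are floored at min_var."""
    columns = list(zip(*feature_rows))
    means = [mean(c) for c in columns]
    variances = [max(pvariance(c), min_var) for c in columns]
    return means, variances


def log_likelihood(features, model):
    """Log-density of a feature vector under the diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (f - m) ** 2 / (2 * v)
        for f, m, v in zip(features, means, variances)
    )


# Hypothetical (human box, object box) pairs for a single action class.
train_pairs = [
    ((100, 50, 40, 120), (135, 90, 20, 20)),
    ((200, 60, 50, 150), (245, 110, 24, 26)),
    ((30, 40, 60, 180), (82, 100, 30, 30)),
]
model = fit_diagonal_gaussian([relative_features(h, o) for h, o in train_pairs])

# A configuration resembling the training pairs, and one that does not.
near = relative_features((0, 0, 40, 120), (35, 40, 20, 20))
far = relative_features((0, 0, 40, 120), (-60, 100, 20, 20))
```

Under these assumptions, `log_likelihood(near, model)` exceeds `log_likelihood(far, model)`, so a candidate object whose placement matches the learned relation scores higher than one on the wrong side of the person.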
probability, gesture recognition, learning (artificial intelligence), object detection, action recognition, weakly supervised learning, still images, model learning, probabilistic model, human-object interaction, Humans, Detectors, Training, Face, Context modeling, Computational modeling, Support vector machines
C. Schmid, A. Prest and V. Ferrari, "Weakly Supervised Learning of Interactions between Humans and Objects," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 3, pp. 601-614, 2012.