IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, Issue No. 08, Aug. 2012
Weilong Yang , Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
Yang Wang , Sch. of Comput. Sci., Simon Fraser Univ., Burnaby, BC, Canada
Tian Lan , Sch. of Comput. Sci., Simon Fraser Univ., Burnaby, BC, Canada
S. N. Robinovitch , Sch. of Eng. Sci., Simon Fraser Univ., Burnaby, BC, Canada
G. Mori , Sch. of Comput. Sci., Simon Fraser Univ., Burnaby, BC, Canada
In this paper, we go beyond recognizing the actions of individuals and focus on group activities. This is motivated by the observation that human actions are rarely performed in isolation; the contextual information of what other people in the scene are doing provides a useful cue for understanding high-level activities. We propose a novel framework for recognizing group activities which jointly captures the group activity, the individual person actions, and the interactions among them. Two types of contextual information, group-person interaction and person-person interaction, are explored in a latent variable framework. In particular, we propose three different approaches to model the person-person interaction. The first approach explores the structure of person-person interaction. Unlike most previous latent structured models, which assume a predefined structure for the hidden layer, e.g., a tree structure, we treat the structure of the hidden layer as a latent variable and implicitly infer it during learning and inference. The second approach explores person-person interaction at the feature level. We introduce a new feature representation called the action context (AC) descriptor. The AC descriptor encodes information about not only the action of an individual person in the video, but also the behavior of other people nearby. The third approach combines the above two. Our experimental results demonstrate the benefit of using contextual information for disambiguating group activities.
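As a rough illustration of the feature-level idea, the sketch below builds a descriptor that joins one person's own action scores with a summary of the scores of people nearby. The abstract does not give the construction, so everything here is an assumption made for illustration: the function name `action_context_descriptor`, the per-person score vectors, the fixed `radius` neighborhood, the max pooling over neighbors, and the zero vector used when nobody is nearby.

```python
from math import hypot


def action_context_descriptor(scores, positions, index, radius):
    """Join one person's action scores with max-pooled neighbor scores.

    scores:    per-person lists of action scores (hypothetical classifier outputs)
    positions: per-person (x, y) locations in the frame
    index:     which person to describe
    radius:    people within this distance count as context (assumed rule)
    """
    own = list(scores[index])
    x0, y0 = positions[index]
    neighbors = [
        scores[j]
        for j, (x, y) in enumerate(positions)
        if j != index and hypot(x - x0, y - y0) <= radius
    ]
    if neighbors:
        # Max-pool each action class over the nearby people.
        context = [max(column) for column in zip(*neighbors)]
    else:
        # Nobody nearby: fall back to zeros (an arbitrary default).
        context = [0.0] * len(own)
    return own + context


# Toy data: three people, two action classes each.
scores = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
positions = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
descriptor = action_context_descriptor(scores, positions, 0, radius=2.0)
isolated = action_context_descriptor(scores, positions, 2, radius=2.0)
```

In this toy setup the first half of `descriptor` is the person's own scores and the second half reflects only the one neighbor inside the radius; a person with no neighbors gets a zero context half.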
trees (mathematics), computer vision, image motion analysis, human activity recognition, discriminative latent models, contextual group activities recognition, contextual information, individual person actions, group-person interaction, person-person interaction, latent variable framework, tree structure, action context, AC, Context, Feature extraction, Biological system modeling, Humans, Adaptation models, Vectors, Context modeling, latent structured models, Group activity recognition
Weilong Yang, Yang Wang, Tian Lan, S. N. Robinovitch, G. Mori, "Discriminative Latent Models for Recognizing Contextual Group Activities", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.34, no. 8, pp. 1549-1562, Aug. 2012, doi:10.1109/TPAMI.2011.228