Issue No. 10 - October (2009 vol. 31)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2009.83
Abhinav Gupta, University of Maryland, College Park
Aniruddha Kembhavi, University of Maryland, College Park
Larry S. Davis, University of Maryland, College Park
Interpreting images and videos of humans interacting with different objects is a daunting task. It involves understanding the scene or event, analyzing human movements, recognizing manipulable objects, and observing the effect of the human movement on those objects. While each of these perceptual tasks can be conducted independently, recognition rates improve when the interactions between them are considered. Motivated by psychological studies of human perception, we present a Bayesian approach that integrates the various perceptual tasks involved in understanding human-object interactions. Previous approaches to object and action recognition rely on static shape/appearance feature matching and motion analysis, respectively. Our approach goes beyond these traditional approaches and applies spatial and functional constraints on each of the perceptual elements to obtain a coherent semantic interpretation. Such constraints allow us to recognize objects and actions when their appearances are not discriminative enough. We also demonstrate the use of such constraints to recognize actions from static images without using any motion information.
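The abstract does not spell out the model, so the following is only a toy sketch of the general idea that joint inference can resolve labels that are ambiguous on their own. It assumes a uniform prior over objects, and it combines an appearance likelihood per object, a motion likelihood per action, and a functional compatibility term P(action | object). The function name `joint_map`, the labels, and every number are hypothetical and are not taken from the paper.

```python
from itertools import product


def joint_map(object_lik, action_lik, compat):
    """Score every (object, action) pair and return the best pair
    together with the normalized joint posterior.

    object_lik[o]  : likelihood of the appearance evidence given object o
    action_lik[a]  : likelihood of the motion evidence given action a
    compat[o][a]   : assumed functional compatibility P(a | o)
    A uniform prior over objects is assumed, so it cancels on normalization.
    """
    scores = {
        (o, a): object_lik[o] * action_lik[a] * compat[o][a]
        for o, a in product(object_lik, action_lik)
    }
    total = sum(scores.values())
    posterior = {pair: s / total for pair, s in scores.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# Hypothetical evidence: appearance alone cannot separate the two objects,
# while the observed motion mildly favors "drinking".
object_lik = {"cup": 0.5, "spray_bottle": 0.5}
action_lik = {"drinking": 0.7, "spraying": 0.3}
compat = {
    "cup": {"drinking": 0.9, "spraying": 0.1},
    "spray_bottle": {"drinking": 0.1, "spraying": 0.9},
}

best, posterior = joint_map(object_lik, action_lik, compat)
print(best, round(posterior[best], 2))
```

In this toy setting the appearance evidence is evenly split between the two objects, yet the compatibility term lets the motion evidence pull the joint estimate toward one consistent object-action pair.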
Action recognition, object recognition, functional recognition.
A. Gupta, A. Kembhavi and L. S. Davis, "Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 10, pp. 1775-1789, 2009.