Bangpeng Yao, Li Fei-Fei, "Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 9, pp. 1691-1703, Sept. 2012.
@article{10.1109/TPAMI.2012.67,
  author = {Bangpeng Yao and Li Fei-Fei},
  title = {Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume = {34},
  number = {9},
  issn = {0162-8828},
  year = {2012},
  pages = {1691-1703},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2012.67},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
TY - JOUR
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI - Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses
IS - 9
SN - 0162-8828
SP - 1691
EP - 1703
EPD - 1691-1703
A1 - Bangpeng Yao
A1 - Li Fei-Fei
PY - 2012
KW - pose estimation
KW - computer vision
KW - object detection
KW - musical instruments data set
KW - human-object interactions recognition
KW - still images
KW - mutual context modeling
KW - object detection
KW - cluttered scenes
KW - articulated human body parts estimation
KW - 2D images
KW - computer vision
KW - human pose estimation
KW - six-class sports data set
KW - 24-class people
KW - Humans
KW - Context
KW - Estimation
KW - Context modeling
KW - Object detection
KW - Biological system modeling
KW - Sports equipment
KW - conditional random field.
KW - Mutual context
KW - action recognition
KW - human pose estimation
KW - object detection
VL - 34
JA - IEEE Transactions on Pattern Analysis and Machine Intelligence
ER -
[1] I. Biederman, R. Mezzanotte, and J. Rabinowitz, "Scene Perception: Detecting and Judging Objects Undergoing Relational Violations," Cognitive Psychology, vol. 14, pp. 143-177, 1982.
[2] A. Oliva and A. Torralba, "The Role of Context in Object Recognition," Trends in Cognitive Sciences, vol. 11, no. 12, pp. 520-527, 2007.
[3] A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie, "Objects in Context," Proc. 11th IEEE Int'l Conf. Computer Vision, 2007.
[4] G. Heitz and D. Koller, "Learning Spatial Context: Using Stuff to Find Things," Proc. European Conf. Computer Vision, 2008.
[5] S. Divvala, D. Hoiem, J. Hays, A. Efros, and M. Hebert, "An Empirical Study of Context in Object Detection," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2009.
[6] K. Murphy, A. Torralba, and W. Freeman, "Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes," Proc. Advances in Neural Information Processing Systems, 2003.
[7] M. Marszalek, I. Laptev, and C. Schmid, "Actions in Context," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2009.
[8] J. Shotton, J. Winn, C. Rother, and A. Criminisi, "TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation," Proc. European Conf. Computer Vision, 2006.
[9] M. Everingham, L. Van Gool, C. Williams, J. Winn, and A. Zisserman, "The PASCAL VOC2008 Results," 2008.
[10] C. Desai, D. Ramanan, and C. Fowlkes, "Discriminative Models for Multi-Class Object Layout," Proc. 12th IEEE Int'l Conf. Computer Vision, 2009.
[11] H. Harzallah, F. Jurie, and C. Schmid, "Combining Efficient Object Localization and Image Classification," Proc. 12th IEEE Int'l Conf. Computer Vision, 2009.
[12] B. Leibe, A. Leonardis, and B. Schiele, "Combined Object Categorization and Segmentation with an Implicit Shape Model," Proc. ECCV Workshop Statistical Learning in Computer Vision, 2004.
[13] J. Henderson, "Human Gaze Control during Real-World Scene Perception," Trends in Cognitive Sciences, vol. 7, no. 11, pp. 498-504, 2003.
[14] A. Gupta, A. Kembhavi, and L. Davis, "Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 10, pp. 1775-1789, Oct. 2009.
[15] B. Yao and L. Fei-Fei, "Grouplet: A Structured Image Representation for Recognizing Human and Object Interactions," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2010.
[16] B. Yao, A. Khosla, and L. Fei-Fei, "Classifying Actions and Measuring Action Similarity by Modeling the Mutual Context of Objects and Human Poses," Proc. Int'l Conf. Machine Learning, 2011.
[17] D. Bub and M. Masson, "Gestural Knowledge Evoked by Objects as Part of Conceptual Representations," Aphasiology, vol. 20, pp. 1112-1124, 2006.
[18] H. Helbig, M. Graf, and M. Kiefer, "The Role of Action Representations in Visual Object Recognition," Experimental Brain Research, vol. 174, pp. 221-228, 2006.
[19] P. Bach, G. Knoblich, T. Gunter, A. Friederici, and W. Prinz, "Action Comprehension: Deriving Spatial and Functional Relations," J. Experimental Psychology: Human Perception and Performance, vol. 31, no. 3, pp. 465-479, 2005.
[20] A. Efros, A. Berg, G. Mori, and J. Malik, "Recognizing Action at a Distance," Proc. Ninth IEEE Int'l Conf. Computer Vision, 2003.
[21] I. Laptev, "On Space-Time Interest Points," Int'l J. Computer Vision, vol. 64, nos. 2/3, pp. 107-123, 2005.
[22] J. Liu, J. Luo, and M. Shah, "Recognizing Realistic Actions from Videos in the Wild," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2009.
[23] J. Niebles, C. Chen, and L. Fei-Fei, "Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification," Proc. European Conf. Computer Vision, 2010.
[24] P. Felzenszwalb and D. Huttenlocher, "Pictorial Structures for Object Recognition," Int'l J. Computer Vision, vol. 61, no. 1, pp. 55-79, 2005.
[25] D. Ramanan, "Learning to Parse Images of Articulated Objects," Proc. Advances in Neural Information Processing Systems, 2006.
[26] M. Andriluka, S. Roth, and B. Schiele, "Pictorial Structures Revisited: People Detection and Articulated Pose Estimation," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2009.
[27] M. Eichner and V. Ferrari, "Better Appearance Models for Pictorial Structures," Proc. British Machine Vision Conf., 2009.
[28] B. Sapp, A. Toshev, and B. Taskar, "Cascaded Models for Articulated Pose Estimation," Proc. European Conf. Computer Vision, 2010.
[29] Y. Yang and D. Ramanan, "Articulated Pose Estimation with Flexible Mixtures-of-Parts," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2011.
[30] X. Ren, A. Berg, and J. Malik, "Recovering Human Body Configurations Using Pairwise Constraints between Parts," Proc. 10th IEEE Int'l Conf. Computer Vision, 2005.
[31] Y. Wang and G. Mori, "Multiple Tree Models for Occlusion and Spatial Constraints in Human Pose Estimation," Proc. European Conf. Computer Vision, 2008.
[32] L. Breiman, "Random Forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[33] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake, "Real-Time Human Pose Recognition in Parts from Single Depth Images," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2011.
[34] P. Viola and M. Jones, "Robust Real-Time Face Detection," Int'l J. Computer Vision, vol. 57, no. 2, pp. 137-154, 2004.
[35] C. Lampert, M. Blaschko, and T. Hofmann, "Beyond Sliding Windows: Object Localization by Efficient Subwindow Search," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2008.
[36] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, "Object Detection with Discriminatively Trained Part-Based Models," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1627-1645, Sept. 2010.
[37] D. Hoiem, A. Efros, and M. Hebert, "Putting Objects in Perspective," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, 2006.
[38] A. Gupta, T. Chen, F. Chen, D. Kimber, and L. Davis, "Context and Observation Driven Latent Variable Model for Human Pose Estimation," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2008.
[39] H. Kjellstrom, D. Kragic, and M. Black, "Tracking People Interacting with Objects," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2010.
[40] B. Rosenhahn, C. Schmaltz, T. Brox, J. Weickert, and H.-P. Seidel, "Staying Well Grounded in Markerless Motion Capture," Proc. Symp. German Assoc. for Pattern Recognition, 2008.
[41] M. Brubaker, L. Sigal, and D. Fleet, "Estimating Contact Dynamics," Proc. 12th IEEE Int'l Conf. Computer Vision, 2009.
[42] C. Desai, D. Ramanan, and C. Fowlkes, "Discriminative Models for Static Human-Object Interactions," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition Workshops, 2010.
[43] W. Yang, Y. Wang, and G. Mori, "Recognizing Human Actions from Still Images with Latent Poses," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2010.
[44] V. Delaitre, I. Laptev, and J. Sivic, "Recognizing Human Actions in Still Images: A Study of Bag-of-Features and Part-Based Representations," Proc. British Machine Vision Conf., 2010.
[45] S. Maji, L. Bourdev, and J. Malik, "Action Recognition from a Distributed Representation of Pose and Appearance," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2011.
[46] A. Prest, C. Schmid, and V. Ferrari, "Weakly Supervised Learning of Interaction between Humans and Objects," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 34, no. 3, pp. 601-614, Mar. 2012.
[47] L. Jie, B. Caputo, and V. Ferrari, "Who's Doing What: Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation," Proc. Advances in Neural Information Processing Systems, 2009.
[48] V. Singh, F. Khan, and R. Nevatia, "Multiple Pose Context Trees for Estimating Human Pose in Object Context," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition Workshops, 2010.
[49] M. Sadeghi and A. Farhadi, "Recognition Using Visual Phrases," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2011.
[50] B. Yao and L. Fei-Fei, "Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2010.
[51] J. Lafferty, A. McCallum, and F. Pereira, "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data," Proc. Int'l Conf. Machine Learning, 2001.
[52] J. Liebelt, C. Schmid, and K. Schertler, "Viewpoint-Independent Object Class Detection Using 3D Feature Maps," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2008.
[53] Y. Wang, H. Jiang, M. Drew, Z.-N. Li, and G. Mori, "Unsupervised Discovery of Action Classes," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, 2006.
[54] L. Bourdev and J. Malik, "Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations," Proc. 12th IEEE Int'l Conf. Computer Vision, 2009.
[55] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, 2005.
[56] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, 2006.
[57] D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," Int'l J. Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[58] V. Ferrari, M. Marín-Jiménez, and A. Zisserman, "Progressive Search Space Reduction for Human Pose Estimation," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2008.

