A Bayesian Computer Vision System for Modeling Human Interactions
August 2000 (vol. 22 no. 8)
pp. 831-843

Abstract: We describe a real-time computer vision and machine learning system for modeling and recognizing human behaviors in a visual surveillance task [1]. The system is particularly concerned with detecting when interactions between people occur and classifying the type of interaction. Examples of interesting interaction behaviors include following another person, altering one's path to meet another, and so forth. Our system combines top-down with bottom-up information in a closed feedback loop, with both components employing a statistical Bayesian approach [2]. We propose and compare two state-based learning architectures, hidden Markov models (HMMs) and coupled hidden Markov models (CHMMs), for modeling behaviors and interactions. The CHMM is shown to work much more efficiently and accurately than the HMM. Finally, to deal with the problem of limited training data, a synthetic “Alife-style” training system is used to develop flexible prior models for recognizing human interactions. We demonstrate the ability to use these a priori models to accurately classify real human behaviors and interactions with no additional tuning or training.

[1] N. Oliver, B. Rosario, and A. Pentland, “A Bayesian Computer Vision System for Modeling Human Interactions,” Proc. Int'l Conf. Vision Systems '99, Jan. 1999.
[2] N. Oliver, “Towards Perceptual Intelligence: Statistical Modeling of Human Individual and Interactive Behaviors,” PhD thesis, Massachusetts Institute of Technology (MIT), Media Lab, Cambridge, Mass., 2000.
[3] T. Darrell and A. Pentland, “Active Gesture Recognition Using Partially Observable Markov Decision Processes,” Int'l Conf. Pattern Recognition, vol. 5, p. C9E, 1996.
[4] A.F. Bobick, “Computers Seeing Action,” Proc. British Machine Vision Conf., vol. 1, pp. 13-22, 1996.
[5] A. Pentland and A. Liu, “Modeling and Prediction of Human Behavior,” Defense Advanced Research Projects Agency, pp. 201-206, 1997.
[6] H. Buxton and S. Gong, “Advanced Visual Surveillance Using Bayesian Networks,” Int'l Conf. Computer Vision, June 1995.
[7] H.H. Nagel, “From Image Sequences Towards Conceptual Descriptions,” Image and Vision Computing, vol. 6, no. 2, pp. 59-74, May 1988.
[8] T. Huang, D. Koller, J. Malik, G. Ogasawara, B. Rao, S. Russell, and J. Weber, “Automatic Symbolic Traffic Scene Analysis Using Belief Networks,” Proc. 12th Nat'l Conf. Artificial Intelligence, pp. 966-972, 1994.
[9] C. Castel, L. Chaudron, and C. Tessier, “What is Going On? A High Level Interpretation of Sequences of Images,” Proc. Workshop on Conceptual Descriptions from Images, European Conf. Computer Vision, pp. 13-27, 1996.
[10] J.H. Fernyhough, A.G. Cohn, and D.C. Hogg, “Building Qualitative Event Models Automatically from Visual Input,” Proc. Int'l Conf. Computer Vision, pp. 350-355, 1998.
[11] W.L. Buntine, “Operations for Learning with Graphical Models,” J. Artificial Intelligence Research, 1994.
[12] L.R. Rabiner, “Tutorial on Hidden Markov Model and Selected Applications in Speech Recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257-285, 1989.
[13] M. Brand, N. Oliver, and A. Pentland, “Coupled Hidden Markov Models for Complex Action Recognition,” Proc. IEEE Computer Vision and Pattern Recognition, 1996.
[14] M. Brand, “Coupled Hidden Markov Models for Modeling Interacting Processes,” Neural Computation, Nov. 1996.
[15] N. Oliver, B. Rosario, and A. Pentland, “Graphical Models for Recognizing Human Interactions,” Proc. Neural Information Processing Systems, Nov. 1998.
[16] N. Oliver, B. Rosario, and A. Pentland, “A Synthetic Agent System for Modeling Human Interactions,” Technical Report, Vision and Modeling Media Lab, MIT, Cambridge, Mass., 1998. http://whitechapel.media.mit.edu/pubtech-reports
[17] B. Rosario, N. Oliver, and A. Pentland, “A Synthetic Agent System for Modeling Human Interactions,” Proc. AA, 1999.
[18] R.K. Bajcsy, “Active Perception vs. Passive Perception,” Proc. CASE Vendor's Workshop, pp. 55-62, 1985.
[19] A. Pentland, “Classification by Clustering,” Proc. IEEE Symp. Machine Processing and Remotely Sensed Data, 1976.
[20] R. Kauth, A. Pentland, and G. Thomas, “Blob: An Unsupervised Clustering Approach to Spatial Preprocessing of MSS Imagery,” 11th Int'l Symp. Remote Sensing of the Environment, 1977.
[21] A. Bobick and R. Bolles, “The Representation Space Paradigm of Concurrent Evolving Object Descriptions,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 14, no. 2, pp. 146-156, Feb. 1992.
[22] C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, “Pfinder: Real-time Tracking of the Human Body,” Photonics East, SPIE, vol. 2615, 1995.
[23] N. Oliver, F. Bérard, and A. Pentland, “Lafter: Lips and Face Tracking,” Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition (CVPR '97), June 1997.
[24] B. Moghaddam and A. Pentland, “Probabilistic Visual Learning for Object Detection,” Int'l Conf. Computer Vision, pp. 786-793, 1995.
[25] C. Wren, A. Azarbayejani, T. Darrell, and A.P. Pentland, “Pfinder: Real-Time Tracking of the Human Body,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 780-785, July 1997.
[26] W.L. Buntine, “A Guide to the Literature on Learning Probabilistic Networks from Data,” IEEE Trans. Knowledge and Data Engineering, 1996.
[27] D. Heckerman, “A Tutorial on Learning with Bayesian Networks,” Technical Report MSR-TR-95-06, Microsoft Research, Redmond, Wash., 1995, revised June 1996.
[28] L.K. Saul and M.I. Jordan, “Boltzmann Chains and Hidden Markov Models,” Proc. Neural Information Processing Systems, G. Tesauro, D.S. Touretzky, and T.K. Leen, eds., vol. 7, 1995.
[29] Z. Ghahramani and M.I. Jordan, “Factorial Hidden Markov Models,” Proc. Neural Information Processing Systems, D.S. Touretzky, M.C. Mozer, and M.E. Hasselmo, eds., vol. 8, 1996.
[30] P. Smyth, D. Heckerman, and M. Jordan, “Probabilistic Independence Networks for Hidden Markov Probability Models,” AI memo 1565, MIT, Cambridge, Mass., Feb. 1996.
[31] C. Williams and G.E. Hinton, “Mean Field Networks That Learn to Discriminate Temporally Distorted Strings,” Proc. Connectionist Models Summer School, pp. 18-22, 1990.
[32] D. Stork and M. Hennecke, “Speechreading: An Overview of Image Processing, Feature Extraction, Sensory Integration and Pattern Recognition Techniques,” Proc. Int'l Conf. Automatic Face and Gesture Recognition, 1996.
[33] M.I. Jordan, Z. Ghahramani, and L.K. Saul, “Hidden Markov Decision Trees,” Proc. Neural Information Processing Systems, D.S. Touretzky, M.C. Mozer, and M.E. Hasselmo, eds., vol. 8, 1996.
[34] F.V. Jensen, S.L. Lauritzen, and K.G. Olesen, “Bayesian Updating in Recursive Graphical Models by Local Computations,” Computational Statistical Quarterly, vol. 4, pp. 269-282, 1990.

Index Terms:
Visual surveillance, people detection, tracking, human behavior recognition, Hidden Markov Models.
Citation:
Nuria M. Oliver, Barbara Rosario, Alex P. Pentland, "A Bayesian Computer Vision System for Modeling Human Interactions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 831-843, Aug. 2000, doi:10.1109/34.868684