Discovery and Segmentation of Activities in Video
August 2000 (vol. 22 no. 8)
pp. 844-851
Abstract—Hidden Markov models (HMMs) have become the workhorses of the monitoring and event recognition literature because they bring to time-series analysis the utility of density estimation and the convenience of dynamic time warping. Once trained, the internals of these models are considered opaque; there is no effort to interpret the hidden states. We show that by minimizing the entropy of the joint distribution, an HMM's internal state machine can be made to organize observed activity into meaningful states. This has uses in video monitoring and annotation, low bit-rate coding of scene activity, and detection of anomalous behavior. We demonstrate with models of office activity and outdoor traffic, showing how the framework learns principal modes of activity and patterns of activity change. We then show how this framework can be adapted to infer hidden state from extremely ambiguous images, in particular, inferring 3D body orientation and pose from sequences of low-resolution silhouettes.
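As a rough illustration of why lower entropy tends to go with more interpretable state machines, the sketch below compares the summed row entropies of a diffuse and a sparse HMM transition matrix. The abstract does not give the objective in detail, so everything here is an assumption for illustration: the helper names (`row_entropy`, `total_entropy`, `log_entropic_prior`), the two example matrices, and the choice to score parameters by an unnormalized prior proportional to exp(-H(θ)). It covers only the transition parameters, not the full joint distribution of the model and data.

```python
import math

def row_entropy(row):
    """Shannon entropy (in nats) of one multinomial parameter vector."""
    # Zero-probability entries contribute nothing (limit of p*log(p) as p -> 0).
    return -sum(p * math.log(p) for p in row if p > 0.0)

def total_entropy(matrix):
    """Sum of row entropies of a row-stochastic transition matrix."""
    return sum(row_entropy(row) for row in matrix)

def log_entropic_prior(matrix):
    """Unnormalized log of exp(-H(theta)); an assumed scoring rule, not the paper's stated one."""
    return -total_entropy(matrix)

# Hypothetical 3-state transition matrices (each row sums to 1).
diffuse = [[1 / 3, 1 / 3, 1 / 3] for _ in range(3)]  # every transition equally likely
sparse = [
    [0.9, 0.1, 0.0],  # state 0 mostly persists, sometimes moves to state 1
    [0.0, 0.9, 0.1],  # state 1 mostly persists, sometimes moves to state 2
    [0.1, 0.0, 0.9],  # state 2 mostly persists, sometimes moves to state 0
]

for name, matrix in (("diffuse", diffuse), ("sparse", sparse)):
    print(name, round(total_entropy(matrix), 4), round(log_entropic_prior(matrix), 4))
```

Under this assumed score, the sparse matrix is preferred: most of its probability mass sits on a few transitions, so each hidden state has a small, readable set of successors, which is the kind of structure the abstract describes as organizing activity into meaningful states.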
Index Terms:
Video activity monitoring, hidden Markov models, hidden state, parameter estimation, entropy minimization.
Citation:
Matthew Brand, Vera Kettnaker, "Discovery and Segmentation of Activities in Video," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 844-851, Aug. 2000, doi:10.1109/34.868685