IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 30, no. 5, May 2008

pp: 909-926

Antoni B. Chan, IEEE

ABSTRACT

A dynamic texture is a spatio-temporal generative model for video, which represents video sequences as observations from a linear dynamical system. This work studies the mixture of dynamic textures, a statistical model for an ensemble of video sequences that is sampled from a finite collection of visual processes, each of which is a dynamic texture. An expectation-maximization (EM) algorithm is derived for learning the parameters of the model, and the model is related to previous works in linear systems, machine learning, time-series clustering, control theory, and computer vision. Through experimentation, it is shown that the mixture of dynamic textures is a suitable representation for both the appearance and dynamics of a variety of visual processes that have traditionally been challenging for computer vision (e.g., fire, steam, water, and vehicle and pedestrian traffic). When compared with state-of-the-art methods in motion segmentation, including both temporal texture methods and traditional representations (e.g., optical flow or other localized motion representations), the mixture of dynamic textures achieves superior performance in the problems of clustering and segmenting video of such processes.
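
The generative process described in the abstract can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes the standard linear dynamical system form x_{t+1} = A x_t + v_t, y_t = C x_t + w_t, with isotropic Gaussian noise and a fixed initial state as simplifications. All names (`sample_lds`, `sample_mixture`, `COMPONENTS`) and the toy parameter values are hypothetical. Each sequence is generated by first picking one component at random and then sampling observations from that component's dynamical system.

```python
import random


def sample_lds(A, C, q_std, r_std, x0, length, rng):
    """Sample observations from x_{t+1} = A x_t + v_t, y_t = C x_t + w_t.

    Assumption: v_t and w_t are isotropic Gaussian noise with standard
    deviations q_std and r_std, and the initial state x0 is fixed.
    """
    n = len(A)  # hidden state dimension
    m = len(C)  # observation dimension (e.g. number of pixels)
    x = list(x0)
    observations = []
    for _ in range(length):
        # Observation: linear map of the hidden state plus noise.
        y = [sum(C[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, r_std)
             for i in range(m)]
        observations.append(y)
        # State transition: linear dynamics plus noise.
        x = [sum(A[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, q_std)
             for i in range(n)]
    return observations


def sample_mixture(components, weights, num_sequences, length, seed=0):
    """Draw sequences from a mixture: pick a component, then sample from it."""
    rng = random.Random(seed)
    labels, sequences = [], []
    for _ in range(num_sequences):
        k = rng.choices(range(len(components)), weights=weights)[0]
        labels.append(k)
        sequences.append(sample_lds(rng=rng, length=length, **components[k]))
    return labels, sequences


# Two toy components (hypothetical values): a slow rotation and a fast decay.
COMPONENTS = [
    {"A": [[0.99, -0.10], [0.10, 0.99]],
     "C": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
     "q_std": 0.01, "r_std": 0.05, "x0": [1.0, 0.0]},
    {"A": [[0.50, 0.00], [0.00, 0.50]],
     "C": [[1.0, 1.0], [1.0, -1.0], [0.0, 1.0]],
     "q_std": 0.01, "r_std": 0.05, "x0": [1.0, 1.0]},
]

labels, sequences = sample_mixture(COMPONENTS, weights=[0.5, 0.5],
                                   num_sequences=4, length=5)
```

Learning runs in the opposite direction: given only `sequences`, an EM algorithm would estimate the component parameters and the hidden `labels`. That step is omitted here because it requires Kalman smoothing for each component.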

INDEX TERMS

Dynamic texture, temporal textures, video modeling, video clustering, motion segmentation, mixture models, linear dynamical systems, time-series clustering, Kalman filter, probabilistic models, expectation-maximization

CITATION

Antoni B. Chan, "Modeling, Clustering, and Segmenting Video with Mixtures of Dynamic Textures",

*IEEE Transactions on Pattern Analysis & Machine Intelligence*, vol. 30, no. 5, pp. 909-926, May 2008, doi:10.1109/TPAMI.2007.70738