Issue No.05 - May (2008 vol.30)
pp: 909-926
ABSTRACT
A dynamic texture is a spatio-temporal generative model for video, which represents video sequences as observations from a linear dynamical system. This work studies the mixture of dynamic textures, a statistical model for an ensemble of video sequences that is sampled from a finite collection of visual processes, each of which is a dynamic texture. An expectation-maximization (EM) algorithm is derived for learning the parameters of the model, and the model is related to previous works in linear systems, machine learning, time-series clustering, control theory, and computer vision. Through experimentation, it is shown that the mixture of dynamic textures is a suitable representation for both the appearance and dynamics of a variety of visual processes that have traditionally been challenging for computer vision (e.g., fire, steam, water, and vehicle and pedestrian traffic). When compared with state-of-the-art methods in motion segmentation, including both temporal texture methods and traditional representations (e.g., optical flow or other localized motion representations), the mixture of dynamic textures achieves superior performance in the problems of clustering and segmenting video of such processes.
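To make the generative view in the abstract concrete, the sketch below samples one toy "video" from a mixture of linear dynamical systems. It is an illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout of a component, the isotropic noise levels (`q_std`, `r_std`), the fixed initial state `x0`, and all numeric values are hypothetical. It assumes the standard linear dynamical system form, in which a hidden state evolves as x_{t+1} = A x_t + v_t and each frame is observed as y_t = C x_t + w_t, with Gaussian noise terms v_t and w_t.

```python
import math
import random


def sample_dynamic_texture(A, C, q_std, r_std, x0, num_frames, rng):
    """Sample frames from x_{t+1} = A x_t + v_t, y_t = C x_t + w_t.

    Assumption: both noise terms are isotropic Gaussians with standard
    deviations q_std (state) and r_std (pixels); x0 is a fixed start state.
    """
    n = len(A)  # hidden state dimension
    m = len(C)  # number of pixels per frame
    x = list(x0)
    frames = []
    for _ in range(num_frames):
        # Observation: linear map of the state plus pixel noise.
        y = [sum(C[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, r_std)
             for i in range(m)]
        frames.append(y)
        # Transition: linear dynamics plus state noise.
        x = [sum(A[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, q_std)
             for i in range(n)]
    return frames


def sample_mixture(components, weights, num_frames, rng):
    """Draw a component index by its mixing weight, then sample the whole
    sequence from that single component."""
    k = rng.choices(range(len(components)), weights=weights, k=1)[0]
    return k, sample_dynamic_texture(num_frames=num_frames, rng=rng,
                                     **components[k])


def rotation(theta, decay=0.98):
    """Hypothetical 2x2 transition matrix: a slightly damped rotation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[decay * c, -decay * s], [decay * s, decay * c]]


# Toy observation matrix mapping a 2-D state to a 3-pixel frame (made up).
C_TOY = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Two hypothetical visual processes: slow versus fast dynamics.
components = [
    dict(A=rotation(0.1), C=C_TOY, q_std=0.05, r_std=0.01, x0=[1.0, 0.0]),
    dict(A=rotation(0.8), C=C_TOY, q_std=0.05, r_std=0.01, x0=[1.0, 0.0]),
]
weights = [0.5, 0.5]

label, video = sample_mixture(components, weights, num_frames=10,
                              rng=random.Random(0))
```

Here `label` records which component generated the sequence. In the clustering setting described above that label is hidden, and an EM procedure would estimate both the component parameters and the soft assignment of each sequence to a component.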
INDEX TERMS
Dynamic texture, temporal textures, video modeling, video clustering, motion segmentation, mixture models, linear dynamical systems, time-series clustering, Kalman filter, probabilistic models, expectation-maximization
CITATION
Antoni B. Chan, Nuno Vasconcelos, "Modeling, Clustering, and Segmenting Video with Mixtures of Dynamic Textures", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.30, no. 5, pp. 909-926, May 2008, doi:10.1109/TPAMI.2007.70738