Issue No. 10 - October 2009 (vol. 31)
pp: 1862-1879
Antoni B. Chan, University of California, San Diego, La Jolla
Nuno Vasconcelos, University of California, San Diego, La Jolla
ABSTRACT
A novel video representation, the layered dynamic texture (LDT), is proposed. The LDT is a generative model, which represents a video as a collection of stochastic layers of different appearance and dynamics. Each layer is modeled as a temporal texture sampled from a different linear dynamical system. The LDT model includes these systems, a collection of hidden layer assignment variables (which control the assignment of pixels to layers), and a Markov random field prior on these variables (which encourages smooth segmentations). An EM algorithm is derived for maximum-likelihood estimation of the model parameters from a training video. It is shown that exact inference is intractable, a problem which is addressed by the introduction of two approximate inference procedures: a Gibbs sampler and a computationally efficient variational approximation. The trade-off between the quality of the two approximations and their complexity is studied experimentally. The ability of the LDT to segment videos into layers of coherent appearance and dynamics is also evaluated, on both synthetic and natural videos. These experiments show that the model possesses an ability to group regions of globally homogeneous, but locally heterogeneous, stochastic dynamics currently unparalleled in the literature.
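The generative view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each layer follows a standard linear-Gaussian state-space form (state update x_t = A x_{t-1} + noise, pixel value y = C x_t + noise), and it takes the layer assignments as a fixed input rather than sampling them from a Markov random field prior. The function name `sample_layered_video`, the 2-D rotation dynamics, and the noise levels are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_layered_video(assignments, n_frames, noise_std=0.1, seed=0):
    """Sample a toy video in which each layer is driven by its own linear dynamical system.

    assignments: 2-D integer array; assignments[r, c] is the layer index of that pixel.
    Returns an array of shape (n_frames, height, width).
    """
    rng = np.random.default_rng(seed)
    height, width = assignments.shape
    n_layers = int(assignments.max()) + 1

    video = np.zeros((n_frames, height, width))
    for layer in range(n_layers):
        mask = assignments == layer
        n_pixels = int(mask.sum())
        if n_pixels == 0:
            continue
        # Hypothetical per-layer parameters (assumptions, not from the paper):
        # A is a rotation scaled by 0.95 so the dynamics are stable,
        # C is a random observation matrix with one row per pixel in the layer.
        theta = 0.3 * (layer + 1)
        A = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                             [np.sin(theta), np.cos(theta)]])
        C = rng.standard_normal((n_pixels, 2))
        x = rng.standard_normal(2)  # initial hidden state of this layer
        for t in range(n_frames):
            # State transition with process noise, then noisy pixel observations
            x = A @ x + 0.1 * rng.standard_normal(2)
            video[t, mask] = C @ x + noise_std * rng.standard_normal(n_pixels)
    return video


# Two layers: left half of the frame is layer 0, right half is layer 1.
assignments = np.zeros((4, 6), dtype=int)
assignments[:, 3:] = 1
video = sample_layered_video(assignments, n_frames=5)
print(video.shape)  # (5, 4, 6)
```

All pixels of a layer share one hidden state sequence, which is why they have coherent dynamics while their appearance differs from pixel to pixel through the rows of C. Segmentation, as studied in the paper, is the inverse problem: recovering the assignments and the per-layer parameters from the video alone.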
INDEX TERMS
Dynamic texture, temporal textures, video modeling, motion segmentation, mixture models, linear dynamical systems, Kalman filter, Markov random fields, probabilistic models, expectation-maximization, variational approximation, Gibbs sampling.
CITATION
Antoni B. Chan, Nuno Vasconcelos, "Layered Dynamic Textures", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.31, no. 10, pp. 1862-1879, October 2009, doi:10.1109/TPAMI.2009.110