Issue No.02 - Feb. (2013 vol.35)
pp: 342-353
A. Ravichandran, UCLA Vision Lab., Univ. of California, Los Angeles, Los Angeles, CA, USA
R. Chaudhry, Center for Imaging Sci., Johns Hopkins Univ., Baltimore, MD, USA
R. Vidal, Center for Imaging Sci., Johns Hopkins Univ., Baltimore, MD, USA
ABSTRACT
We consider the problem of categorizing video sequences of dynamic textures, i.e., nonrigid dynamical objects such as fire, water, steam, flags, etc. This problem is extremely challenging because the shape and appearance of a dynamic texture continuously change as a function of time. State-of-the-art dynamic texture categorization methods have been successful at classifying videos taken from the same viewpoint and scale by using a Linear Dynamical System (LDS) to model each video, and using distances or kernels in the space of LDSs to classify the videos. However, these methods perform poorly when the video sequences are taken from different viewpoints or at different scales. In this paper, we propose a novel dynamic texture categorization framework that can handle such changes. We model each video sequence with a collection of LDSs, each one describing a small spatiotemporal patch extracted from the video. This Bag-of-Systems (BoS) representation is analogous to the Bag-of-Features (BoF) representation for object recognition, except that we use LDSs as feature descriptors. This choice poses several technical challenges in adopting the traditional BoF approach. Most notably, the space of LDSs is not Euclidean; hence, novel methods for clustering LDSs and computing codewords of LDSs need to be developed. We propose a framework that makes use of nonlinear dimensionality reduction and clustering techniques combined with the Martin distance for LDSs to tackle these issues. Our experiments compare the proposed BoS approach to existing dynamic texture categorization methods and show that it can be used for recognizing dynamic textures in challenging scenarios that could not be handled by existing methods.
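The BoS idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: each patch is flattened into a pixels-by-frames matrix; the LDS parameters (A, C) are identified with a simple SVD-based procedure; the Martin distance is approximated from the principal angles between finite-horizon observability matrices (horizon m) rather than the infinite-horizon ones; and the codewords are taken as given instead of being learned by dimensionality reduction and clustering. The names fit_lds, martin_distance_sq, and bos_histogram are hypothetical.

```python
import numpy as np

def fit_lds(Y, n):
    """Fit (A, C) of an n-state LDS to a patch Y of shape (pixels, frames).

    Assumption: SVD-based (suboptimal) identification; noise terms are ignored.
    """
    Y = Y - Y.mean(axis=1, keepdims=True)      # remove the temporal mean
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n]                                # observation matrix
    X = np.diag(s[:n]) @ Vt[:n, :]              # estimated state trajectory
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])    # least-squares state transition
    return A, C

def observability(A, C, m):
    """Stack C, CA, ..., CA^(m-1) into a finite observability matrix."""
    blocks = [C]
    for _ in range(m - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def martin_distance_sq(lds1, lds2, m=20):
    """Approximate squared Martin distance, -ln(prod cos^2(theta_i)).

    theta_i are the principal angles between the column spaces of the two
    finite observability matrices (an approximation of the exact definition).
    """
    A1, C1 = lds1
    A2, C2 = lds2
    Q1, _ = np.linalg.qr(observability(A1, C1, m))
    Q2, _ = np.linalg.qr(observability(A2, C2, m))
    cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    cosines = np.clip(cosines, 1e-12, 1.0)      # guard against log(0) and round-off
    return float(-2.0 * np.sum(np.log(cosines)))

def bos_histogram(patch_models, codewords, m=20):
    """Assign each patch LDS to its nearest codeword LDS; return the normalized histogram."""
    counts = np.zeros(len(codewords))
    for model in patch_models:
        dists = [martin_distance_sq(model, cw, m) for cw in codewords]
        counts[int(np.argmin(dists))] += 1
    return counts / counts.sum()
```

Because the distance depends only on the subspaces spanned by the observability matrices, it does not change when the state basis changes, which is one reason a subspace-angle distance suits LDS parameters better than a Euclidean distance on the raw matrices.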
INDEX TERMS
Video sequences, Feature extraction, Spatiotemporal phenomena, Measurement, Heuristic algorithms, Observability, Training, Linear dynamical systems, Dynamic textures, Categorization
CITATION
A. Ravichandran, R. Chaudhry, R. Vidal, "Categorizing Dynamic Textures Using a Bag of Dynamical Systems", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 2, pp. 342-353, Feb. 2013, doi:10.1109/TPAMI.2012.83