Analysis and Synthesis of Textured Motion: Particles and Waves
October 2004 (vol. 26 no. 10)
pp. 1348-1363
Natural scenes contain a wide range of textured motion phenomena which are characterized by the movement of a large number of particle and wave elements, such as falling snow, wavy water, and dancing grass. In this paper, we present a generative model for representing these motion patterns and study a Markov chain Monte Carlo algorithm for inferring the generative representation from observed video sequences. Our generative model consists of three components. The first is a photometric model which represents an image as a linear superposition of image bases selected from a generic and overcomplete dictionary. The dictionary contains Gabor and LoG bases for point/particle elements and Fourier bases for wave elements. These bases compete to explain the input images and transform them into a token (base) representation with an O(10^2)-fold dimension reduction. The second component is a geometric model which groups spatially adjacent tokens (bases) and their motion trajectories into a number of moving elements, called "motons." A moton is a deformable template in time-space representing a moving element, such as a falling snowflake or a flying bird. The third component is a dynamic model which characterizes the motion of particles, waves, and their interactions. For example, the motion of particle objects floating in a river, such as leaves and balls, should be coupled with the motion of waves. The trajectories of these moving elements are represented by coupled Markov chains. The dynamic model also includes probabilistic representations for the birth/death (source/sink) of the motons. We adopt a stochastic gradient algorithm for learning and inference. Given an input video sequence, the algorithm iterates two steps: 1) computing the motons and their trajectories by a number of reversible Markov chain jumps, and 2) learning the parameters that govern the geometric deformations and motion dynamics. 
Novel video sequences are synthesized from the learned models and, by editing the model parameters, we demonstrate the controllability of the generative model.
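
The idea behind the photometric model, bases from an overcomplete dictionary competing to explain a signal, can be illustrated with a small sketch. The code below is not the paper's inference procedure (the abstract describes Markov chain Monte Carlo inference on images); it swaps in plain greedy matching pursuit on a 1D signal as a simpler stand-in. All function names, the atom parameters (sigma, freq, the spacing of centers), and the signal length are assumptions made for illustration only.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gabor_atom(n, center, sigma, freq):
    """Localized atom (Gaussian envelope times a cosine); stands in for a particle base."""
    return normalize([
        math.exp(-0.5 * ((t - center) / sigma) ** 2)
        * math.cos(2 * math.pi * freq * (t - center))
        for t in range(n)
    ])


def fourier_atom(n, k, phase):
    """Global atom with k cycles over the signal; stands in for a wave base."""
    return normalize([math.cos(2 * math.pi * k * t / n + phase) for t in range(n)])


def build_dictionary(n):
    """Overcomplete dictionary mixing localized and global atoms (hypothetical layout)."""
    atoms = {}
    for center in range(0, n, 2):
        atoms[("gabor", center)] = gabor_atom(n, center, sigma=1.5, freq=0.25)
    for k in range(1, 5):
        atoms[("cos", k)] = fourier_atom(n, k, 0.0)
        atoms[("sin", k)] = fourier_atom(n, k, -math.pi / 2)
    return atoms


def matching_pursuit(signal, atoms, max_tokens=5, tol=1e-6):
    """Greedily pick the atom that best explains the residual, then subtract it.

    Returns the selected tokens as (atom name, coefficient) pairs and the
    final residual.
    """
    residual = list(signal)
    tokens = []
    for _ in range(max_tokens):
        # Atoms "compete": the one with the largest |projection| wins this round.
        name, coeff = max(
            ((nm, dot(residual, atom)) for nm, atom in atoms.items()),
            key=lambda pair: abs(pair[1]),
        )
        if abs(coeff) < tol:
            break
        tokens.append((name, coeff))
        atom = atoms[name]
        residual = [r - coeff * a for r, a in zip(residual, atom)]
    return tokens, residual


if __name__ == "__main__":
    n = 64
    atoms = build_dictionary(n)
    # Synthetic signal: one "particle" plus one "wave".
    signal = [3 * g + 2 * c
              for g, c in zip(atoms[("gabor", 20)], atoms[("cos", 2)])]
    tokens, residual = matching_pursuit(signal, atoms, max_tokens=2)
    for name, coeff in tokens:
        print(name, round(coeff, 3))
    print("residual energy:", dot(residual, residual))
```

In this toy setting, a 64-sample signal is summarized by two (name, coefficient) tokens, which is the kind of dimension reduction the token representation aims at. A greedy selection like this does not revisit earlier choices, which is one reason a sampling-based inference scheme can be preferable when bases overlap heavily.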

Index Terms:
Textured motion, generative model, texton, statistical learning, object tracking, stochastic gradient.
Yizhou Wang, Song-Chun Zhu, "Analysis and Synthesis of Textured Motion: Particles and Waves," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 10, pp. 1348-1363, Oct. 2004, doi:10.1109/TPAMI.2004.76