Space-Time Behavior-Based Correlation—OR—How to Tell If Two Underlying Motion Fields Are Similar Without Computing Them?
November 2007 (vol. 29 no. 11)
pp. 2045-2056
We introduce a behavior-based similarity measure which tells us whether two different space-time intensity patterns of two different video segments could have resulted from a similar underlying motion field. This is done directly from the intensity information, without explicitly computing the underlying motions. Such a measure allows us to detect similarity between video segments of differently dressed people performing the same type of activity. It requires no foreground/background segmentation, no prior learning of activities, and no motion estimation or tracking. Using this behavior-based similarity measure, we extend the notion of two-dimensional image correlation into the three-dimensional space-time volume, thus allowing us to correlate dynamic behaviors and actions. Small space-time video segments (small video clips) are "correlated" against entire video sequences in all three dimensions (x, y, and t). Peak correlation values correspond to video locations with similar dynamic behaviors. Our approach can detect very complex behaviors in video sequences (e.g., ballet movements, pool dives, running water), even when multiple complex activities occur simultaneously within the field of view of the camera. We further show its robustness to small changes in scale and orientation of the correlated behavior.
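
The sketch below illustrates the correlation framework described in the abstract: a small space-time template is slid over a video volume in all three dimensions (x, y, and t), a similarity score is computed at every location, and peaks in the resulting correlation volume mark candidate occurrences of the behavior. For brevity, the paper's behavior-based consistency measure is replaced here by plain 3D normalized cross-correlation of intensities; the function and variable names are illustrative only and are not taken from the paper.

```python
# Minimal sketch: space-time template correlation against a video volume.
# NOTE: plain intensity NCC is used as a stand-in for the paper's
# behavior-based similarity measure; names here are illustrative.
import numpy as np

def spacetime_correlate(video, template):
    """Slide a small space-time template over a video volume (t, y, x)
    and return a correlation volume; peaks mark locations whose
    space-time intensity pattern resembles the template."""
    T, H, W = video.shape
    t, h, w = template.shape
    tpl = template - template.mean()
    tpl_norm = np.linalg.norm(tpl) + 1e-8

    out = np.full((T - t + 1, H - h + 1, W - w + 1), -1.0)
    for ti in range(out.shape[0]):
        for yi in range(out.shape[1]):
            for xi in range(out.shape[2]):
                patch = video[ti:ti + t, yi:yi + h, xi:xi + w]
                patch = patch - patch.mean()
                denom = np.linalg.norm(patch) * tpl_norm + 1e-8
                out[ti, yi, xi] = float(np.sum(patch * tpl) / denom)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.random((30, 64, 64))          # synthetic video volume (t, y, x)
    template = video[10:16, 20:30, 20:30]     # a small space-time clip cut from it
    corr = spacetime_correlate(video, template)
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    print("peak correlation at (t, y, x) =", peak)   # expected: (10, 20, 20)
```

In the actual method the per-location score is derived directly from local space-time intensity patterns so that differently dressed people performing the same activity still score highly, which plain intensity correlation does not guarantee; the exhaustive triple loop above would likewise be replaced by a more efficient coarse-to-fine search in practice.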


Index Terms:
Space-time analysis, motion analysis, action recognition, motion similarity measure, template matching, video correlation, video indexing, video browsing
Citation:
Eli Shechtman, Michal Irani, "Space-Time Behavior-Based Correlation—OR—How to Tell If Two Underlying Motion Fields Are Similar Without Computing Them?," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 11, pp. 2045-2056, Nov. 2007, doi:10.1109/TPAMI.2007.1119