Issue No.01 - January (2010 vol.32)
pp: 171-177
Vijay Mahadevan, University of California, San Diego, La Jolla
A spatiotemporal saliency algorithm based on a center-surround framework is proposed. The algorithm is inspired by biological mechanisms of motion-based perceptual grouping and extends a discriminant formulation of center-surround saliency previously proposed for static imagery. Under this formulation, the saliency of a location is equated to the power of a predefined set of features to discriminate between the visual stimuli in a center and a surround window, centered at that location. The features are spatiotemporal video patches and are modeled as dynamic textures, to achieve a principled joint characterization of the spatial and temporal components of saliency. The combination of discriminant center-surround saliency with the modeling power of dynamic textures yields a robust, versatile, and fully unsupervised spatiotemporal saliency algorithm, applicable to scenes with highly dynamic backgrounds and moving cameras. The related problem of background subtraction is treated as the complement of saliency detection, by classifying nonsalient (with respect to appearance and motion dynamics) points in the visual field as background. The algorithm is tested for background subtraction on challenging sequences, and shown to substantially outperform various state-of-the-art techniques. Quantitatively, its average error rate is almost half that of the closest competitor.
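The center-surround idea in the abstract can be illustrated with a much simpler feature than the one the paper uses. The sketch below is not the authors' implementation: it assumes that "power to discriminate" is measured as the mutual information between a feature and the center/surround label, and it swaps the dynamic-texture model for a plain histogram of pixel intensities over a short spatiotemporal window. The function names, window sizes, bin count, and threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def mutual_information(center, surround, bins=16, value_range=(0.0, 1.0)):
    """I(X; Y) in bits between a scalar feature X and the label Y in {center, surround}.

    Assumption: histograms stand in for the dynamic-texture model of the paper.
    """
    h_c, _ = np.histogram(center, bins=bins, range=value_range)
    h_s, _ = np.histogram(surround, bins=bins, range=value_range)
    n_c, n_s = h_c.sum(), h_s.sum()
    if n_c == 0 or n_s == 0:
        return 0.0
    p_c, p_s = h_c / n_c, h_s / n_s           # class-conditional densities
    prior_c = n_c / (n_c + n_s)               # label prior from window sizes
    prior_s = 1.0 - prior_c
    p_x = prior_c * p_c + prior_s * p_s       # marginal feature density

    def kl(p, q):
        mask = p > 0                          # q > 0 wherever p > 0 by construction
        return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

    # I(X; Y) = sum over classes of P(class) * KL(p(x | class) || p(x))
    return prior_c * kl(p_c, p_x) + prior_s * kl(p_s, p_x)

def saliency_map(clip, center=2, surround=6, bins=16):
    """clip: array of shape (T, H, W) with values in [0, 1]; returns an (H, W) map.

    Locations closer than `surround` pixels to the border are left at zero.
    """
    _, height, width = clip.shape
    size = 2 * surround + 1
    offset = surround - center
    inner = np.zeros((size, size), dtype=bool)
    inner[offset:offset + 2 * center + 1, offset:offset + 2 * center + 1] = True
    sal = np.zeros((height, width))
    for y in range(surround, height - surround):
        for x in range(surround, width - surround):
            window = clip[:, y - surround:y + surround + 1,
                          x - surround:x + surround + 1]
            sal[y, x] = mutual_information(window[:, inner], window[:, ~inner], bins)
    return sal

def background_mask(sal, threshold):
    """Background as the complement of saliency: True where saliency is low."""
    return sal < threshold

# Toy clip: flat background with one 5x5 patch of a different intensity.
clip = np.full((3, 32, 32), 0.2)
clip[:, 14:19, 14:19] = 0.8
sal = saliency_map(clip)
mask = background_mask(sal, threshold=0.1)
```

In this toy clip the saliency peaks where the center window exactly covers the patch, and locations whose whole window is flat background score zero and are labeled background. Capturing motion dynamics, as the paper does, would require replacing the intensity histogram with a model of how patches evolve over time.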
Spatiotemporal saliency, background subtraction, dynamic backgrounds, motion saliency, dynamic texture, discriminant center-surround architecture, video modeling.
Vijay Mahadevan, "Spatiotemporal Saliency in Dynamic Scenes", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.32, no. 1, pp. 171-177, January 2010, doi:10.1109/TPAMI.2009.112