Visual Saliency Based on Scale-Space Analysis in the Frequency Domain
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 4, April 2013, pp. 996-1010
Jian Li, Institute of Automation, National University of Defense Technology, Changsha, China
M. D. Levine, Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada
Xiangjing An, Institute of Automation, National University of Defense Technology, Changsha, China
Xin Xu, Institute of Automation, National University of Defense Technology, Changsha, China
Hangen He, Institute of Automation, National University of Defense Technology, Changsha, China
ABSTRACT
We address the issue of visual saliency from three perspectives. First, we treat saliency detection as a frequency domain analysis problem. Second, we perform the detection by employing the concept of nonsaliency. Third, we simultaneously consider the detection of salient regions of different sizes. This paper proposes a new bottom-up paradigm for detecting visual saliency, characterized by a scale-space analysis of the amplitude spectrum of natural images. We show that convolving the image amplitude spectrum with a low-pass Gaussian kernel of an appropriate scale is equivalent to an image saliency detector. The saliency map is obtained by reconstructing the 2D signal from the original phase and the amplitude spectrum filtered at a scale selected by minimizing saliency map entropy. A Hypercomplex Fourier Transform performs the analysis in the frequency domain. Using publicly available databases, we demonstrate experimentally that the proposed model can predict human fixation data. We also introduce a new image database and use it to show that the saliency detector can highlight both small and large salient regions and inhibit repeated distractors in cluttered images. In addition, we show that it is able to predict salient regions on which people focus their attention.
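As a rough illustration of the pipeline the abstract describes, the following sketch smooths the amplitude spectrum with Gaussian kernels at several scales, reconstructs a candidate saliency map at each scale using the original phase, and keeps the map with minimum entropy. It is a minimal single-channel approximation, not the authors' implementation: an ordinary 2D FFT on a grayscale image stands in for the Hypercomplex Fourier Transform the paper applies to color images, a plain histogram entropy stands in for the paper's entropy criterion, and the dyadic scale range, the post-smoothing, and all function names are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def saliency_at_scale(amplitude, phase, sigma):
    # Smooth the amplitude spectrum with a low-pass Gaussian kernel of
    # scale sigma, then reconstruct with the original phase; the squared
    # magnitude of the reconstruction serves as the candidate saliency map.
    smoothed = gaussian_filter(amplitude, sigma)
    recon = np.fft.ifft2(smoothed * np.exp(1j * phase))
    sal = np.abs(recon) ** 2
    return gaussian_filter(sal, sigma=3.0)  # mild post-smoothing (assumed)

def map_entropy(sal, bins=256):
    # Shannon entropy of the map's intensity histogram; a simple
    # stand-in for the entropy criterion used to select the scale.
    hist, _ = np.histogram(sal, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def scale_space_saliency(img):
    # img: 2D grayscale float array. Returns a saliency map scaled to [0, 1].
    F = np.fft.fft2(img)
    amplitude, phase = np.abs(F), np.angle(F)
    # Scale space over the amplitude spectrum: dyadic kernel scales (assumed range).
    candidates = [saliency_at_scale(amplitude, phase, 2.0 ** k) for k in range(7)]
    # Select the scale that minimizes saliency map entropy.
    best = min(candidates, key=map_entropy)
    return best / (best.max() + 1e-12)

# Example usage with a random test image (any 2D float array works):
# sal = scale_space_saliency(np.random.rand(256, 256))
```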
INDEX TERMS
Visualization, frequency domain analysis, Fourier transforms, kernel, computational modeling, convolution, scale-space analysis, visual attention, saliency, hypercomplex Fourier transform, eye tracking
CITATION
Jian Li, M. D. Levine, Xiangjing An, Xin Xu, and Hangen He, "Visual Saliency Based on Scale-Space Analysis in the Frequency Domain," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 4, pp. 996-1010, April 2013, doi:10.1109/TPAMI.2012.147