A Coherent Computational Approach to Model Bottom-Up Visual Attention
May 2006 (vol. 28 no. 5)
pp. 802-817
Visual attention is a mechanism that filters out redundant visual information and detects the most relevant parts of our visual field. Automatic determination of the most visually relevant areas would be useful in many applications, such as image and video coding, watermarking, video browsing, and quality assessment. Many research groups are currently investigating computational modeling of the visual attention system. The first published computational models were based on a few basic, well-understood properties of the Human Visual System (HVS). These models feature a single perceptual layer that simulates only one aspect of the visual system. More recent models integrate complex features of the HVS and simulate a hierarchical perceptual representation of the visual input. The bottom-up mechanism is the feature most commonly found in modern models. This mechanism refers to involuntary attention (i.e., salient spatial visual features that effortlessly or involuntarily attract our attention). This paper presents a coherent computational approach to the modeling of bottom-up visual attention. The model is mainly based on the current understanding of HVS behavior. Contrast sensitivity functions, perceptual decomposition, visual masking, and center-surround interactions are some of the features implemented in this model. The performance of the algorithm is assessed using natural images and experimental measurements from an eye-tracking system. Two well-known metrics (correlation coefficient and Kullback-Leibler divergence) are used to validate the model. A further metric is also defined. Finally, the results from this model are compared to those from a reference bottom-up model.
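The two validation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so the code below uses the textbook definitions of the linear correlation coefficient and the Kullback-Leibler divergence, applied to two maps of equal size (for example, a predicted saliency map and a fixation density map). The function names, the `eps` smoothing term, and the toy maps are assumptions made for illustration, not details taken from the paper.

```python
import math

def flatten(saliency_map):
    """Turn a 2D map (list of rows) into a flat list of values."""
    return [value for row in saliency_map for value in row]

def correlation_coefficient(map_a, map_b):
    """Pearson linear correlation between two maps of equal size."""
    a, b = flatten(map_a), flatten(map_b)
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / math.sqrt(var_a * var_b)

def kl_divergence(map_p, map_q, eps=1e-12):
    """KL divergence D(P || Q) after normalising each map to sum to 1.

    eps (an assumption of this sketch) is added to every cell so that
    zero-valued cells do not produce log(0) or a division by zero.
    """
    p, q = flatten(map_p), flatten(map_q)
    n = len(p)
    sum_p = sum(p) + n * eps
    sum_q = sum(q) + n * eps
    total = 0.0
    for x, y in zip(p, q):
        px = (x + eps) / sum_p
        qy = (y + eps) / sum_q
        total += px * math.log(px / qy)
    return total

# Toy 2x3 maps: "predicted" peaks on the left, "observed" peaks on the right.
predicted = [[0.9, 0.4, 0.1],
             [0.8, 0.3, 0.0]]
observed = [[0.1, 0.4, 0.9],
            [0.0, 0.3, 0.8]]

cc = correlation_coefficient(predicted, observed)
kl = kl_divergence(observed, predicted)
```

Under these definitions, a correlation close to 1 and a divergence close to 0 both indicate that the two maps agree; the divergence is not symmetric, so the order of the arguments matters.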

Index Terms:
Computationally modeled human vision, bottom-up visual attention, coherent modeling, eye tracking experiments.
Olivier Le Meur, Patrick Le Callet, Dominique Barba, Dominique Thoreau, "A Coherent Computational Approach to Model Bottom-Up Visual Attention," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 5, pp. 802-817, May 2006, doi:10.1109/TPAMI.2006.86