Esaliency (Extended Saliency): Meaningful Attention Using Stochastic Image Modeling
April 2010 (vol. 32 no. 4)
pp. 693-708
Tamar Avraham, Technion—Israeli Institute of Technology, Haifa
Michael Lindenbaum, Technion—Israeli Institute of Technology, Haifa
Computer vision attention processes assign varying hypothesized importance to different parts of the visual input and direct the allocation of computational resources. This nonuniform allocation might help accelerate the image analysis process. This paper proposes a new bottom-up attention mechanism. Rather than taking the traditional approach, which tries to model human attention, we propose a validated stochastic model to estimate the probability that an image part is of interest. We refer to this probability as saliency and thus specify saliency in a mathematically well-defined sense. The model quantifies several intuitive observations, such as the greater likelihood of correspondence between visually similar image regions and the likelihood that only a few interesting objects will be present in the scene. The latter observation, which implies that such objects are (relaxed) global exceptions, replaces the traditional preference for local contrast. The algorithm starts with a rough preattentive segmentation and then uses a graphical model approximation to efficiently reveal which segments are more likely to be of interest. Experiments on natural scenes containing a variety of objects demonstrate the proposed method and show its advantages over previous approaches.
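The two intuitions named in the abstract (visually similar segments tend to share a label, and only a few segments are expected to be of interest) can be illustrated with a toy sketch. This is not the model or the graphical model approximation from the paper: the function name `toy_saliency`, the one-dimensional features, the exponential similarity measure, and the parameters `expected_targets` and `coupling` are all assumptions made for illustration, and the marginals are computed by brute-force enumeration, which is only feasible for a handful of segments.

```python
from itertools import product
import math


def toy_saliency(features, expected_targets=1.0, coupling=2.0):
    """Return, for each segment, the marginal probability of being 'of interest'.

    Hypothetical toy model (not the paper's):
      - each segment is a target with prior probability expected_targets / n,
        which encodes "only a few interesting objects";
      - giving different labels to two similar segments is penalized,
        which encodes "similar regions tend to share a label".
    """
    n = len(features)
    prior = expected_targets / n  # assumes expected_targets < n

    def similarity(a, b):
        # Assumed similarity measure: 1 for identical features, near 0 for distant ones.
        return math.exp(-abs(a - b))

    marginals = [0.0] * n
    total = 0.0
    # Enumerate every labeling (1 = of interest, 0 = not of interest).
    for labels in product((0, 1), repeat=n):
        k = sum(labels)
        weight = (prior ** k) * ((1.0 - prior) ** (n - k))
        for i in range(n):
            for j in range(i + 1, n):
                if labels[i] != labels[j]:
                    weight *= math.exp(-coupling * similarity(features[i], features[j]))
        total += weight
        for i, label in enumerate(labels):
            if label:
                marginals[i] += weight
    # Normalize the accumulated weights into probabilities.
    return [m / total for m in marginals]


if __name__ == "__main__":
    # Four mutually similar segments and one global exception (the last one).
    segment_features = [0.0, 0.1, 0.05, 0.0, 5.0]
    for idx, s in enumerate(toy_saliency(segment_features)):
        print(f"segment {idx}: saliency {s:.3f}")
```

In this sketch the segment whose feature differs from all the others receives the highest marginal, since labeling it alone as a target breaks no strong similarity ties. That mirrors the abstract's notion of interesting objects as global exceptions rather than regions of local contrast.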

Index Terms:
Computer vision, scene analysis, similarity measures, performance evaluation of algorithms and systems, object recognition, visual search, attention.
Tamar Avraham, Michael Lindenbaum, "Esaliency (Extended Saliency): Meaningful Attention Using Stochastic Image Modeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 4, pp. 693-708, April 2010, doi:10.1109/TPAMI.2009.53