Issue No. 6, June 2009 (vol. 31), pp. 989-1005
Dashan Gao , General Electric Global Research, Niskayuna
Sunhyoung Han , University of California, San Diego, La Jolla
Nuno Vasconcelos , University of California, San Diego, La Jolla
ABSTRACT
A discriminant formulation of top-down visual saliency, intrinsically connected to the recognition problem, is proposed. The new formulation is shown to be closely related to a number of classical principles for the organization of perceptual systems, including infomax, inference by detection of suspicious coincidences, classification with minimal uncertainty, and classification with minimum probability of error. The implementation of these principles with computational parsimony, by exploiting the statistics of natural images, is investigated. It is shown that Barlow's principle of inference by the detection of suspicious coincidences enables computationally efficient saliency measures that are nearly optimal for classification. This principle is adopted for the solution of the two fundamental problems of discriminant saliency: feature selection and saliency detection. The resulting saliency detector is shown to have a number of interesting properties and to act effectively as a focus-of-attention mechanism for the selection of interest points according to their relevance for visual recognition. Experimental evidence shows that the selected points perform well with respect to 1) the ability to localize objects embedded in significant amounts of clutter, 2) the ability to capture information relevant for image classification, and 3) the richness of the set of visual attributes that can be considered salient.
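For concreteness, the formulation summarized above can be read as a mutual-information saliency measure. The following is a minimal sketch in our own notation; the per-feature decomposition and the generalized Gaussian model are our interpretation of the abstract's appeal to natural image statistics, not equations quoted from the paper:

S(l) = I(X(l); Y) \approx \sum_{k=1}^{n} I(X_k(l); Y),
I(X_k; Y) = \sum_{c \in \{0,1\}} P_Y(c) \, D_{\mathrm{KL}}\!\left( P_{X_k \mid Y}(x \mid c) \,\|\, P_{X_k}(x) \right),

where X(l) = (X_1(l), ..., X_n(l)) collects the feature responses at location l, Y is the class label (object of interest versus background), and D_KL denotes the Kullback-Leibler divergence. Locations whose feature statistics deviate strongly from the overall statistics of natural images, "suspicious coincidences" in Barlow's sense, score as salient. If each marginal P_{X_k} is modeled by a generalized Gaussian distribution, the divergences admit closed-form expressions in the distribution parameters, which is one route to the computational parsimony claimed above.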
INDEX TERMS
visual saliency, interest point detection, coincidence detection, visual recognition, object detection from cluttered scenes, infomax feature selection, saliency measures, natural image statistics
CITATION
Dashan Gao, Sunhyoung Han, and Nuno Vasconcelos, "Discriminant Saliency, the Detection of Suspicious Coincidences, and Applications to Visual Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 6, pp. 989-1005, June 2009, doi:10.1109/TPAMI.2009.27.