
Song-Chun Zhu, "Statistical Modeling and Conceptualization of Visual Patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 6, pp. 691-712, June 2003.
DOI: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2003.1201820
ISSN: 0162-8828
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Keywords: Perceptual organization, descriptive models, generative models, causal Markov models, discriminative methods, minimax entropy learning, mixed Markov models.
Abstract—Natural images contain an overwhelming number of visual patterns generated by diverse stochastic processes. Defining and modeling these patterns is of fundamental importance for generic vision tasks, such as perceptual organization, segmentation, and recognition. The objective of this epistemological paper is to summarize various threads of research in the literature and to pursue a unified framework for conceptualizing, modeling, learning, and computing visual patterns. The paper starts by reviewing four research streams: 1) the study of image statistics, 2) the analysis of image components, 3) the grouping of image elements, and 4) the modeling of visual patterns. The models from these research streams are then divided into four categories according to their semantic structures: 1) descriptive models, i.e., Markov random fields (MRF) or Gibbs models, 2) variants of descriptive models (causal MRF and "pseudo-descriptive" models), 3) generative models, and 4) discriminative models. The objectives, principles, theories, and typical models are reviewed in each category, and the relationships between the four types of models are studied. Two central themes emerge from the relationship studies. 1) In representation, the integration of descriptive and generative models is the future direction for statistical modeling and should lead to richer and more advanced classes of vision models. 2) To make visual models computationally tractable, discriminative models are used as computational heuristics for inferring generative models. Thus, the roles of the four types of models are clarified. The paper also addresses the issue of conceptualizing visual patterns and their components (vocabularies) from the perspective of statistical mechanics. Under this unified framework, a visual pattern is equated with a statistical ensemble, and, furthermore, statistical models for various visual patterns form a "continuous" spectrum in the sense that they belong to a series of nested probability families in the space of attributed graphs.