loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Spatial Random Tree Grammars for Modeling Hierarchal Structure in Images with Regions of Arbitrary Shape
September 2007 (vol. 29 no. 9)
pp. 1504-1519
We present a novel probabilistic model for the hierarchal structure of an image and its regions. We call this model spatial random tree grammars (SRTGs). We develop algorithms for exact computation of likelihoods and MAP estimates and exact EM updates for model-parameter estimation. We collectively call these algorithms the center-surround algorithm. We use the center-surround algorithm to automatically estimate the ML parameters of SRTGs, classify images based on their likelihood and based on the MAP estimate of the associated hierarchal structure. We apply our method to the task of classifying natural images and demonstrate that the addition of hierarchal structure significantly improves upon the performance of a baseline model that lacks such structure.

[1] J.K. Baker, “Trainable Grammars for Speech Recognition,” Proc. Speech Comm. Papers, 97th Meeting of the Acoustical Soc. Am., pp.547-550, 1979.
[2] L.E. Baum, T. Petrie, G. Soules, and N. Weiss, “A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains,” The Annals of Math. Statistics, vol. 41, no. 1, pp. 164-171, 1970.
[3] J. Besag, “Spatial Interaction and the Statistical Analysis of Lattice Systems,” J. Royal Statistical Soc. B, vol. 36, no. 2, pp. 192-236, 1974.
[4] J. Besag, “Efficiency of Pseudolikelihood Estimation for Simple Gaussian Fields,” Biometrika, vol. 64, no. 3, pp. 616-618, 1977.
[5] A.F. Bobick and Y.A. Ivanov, “Action Recognition Using Probabilistic Parsing,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 196-202, June 1998.
[6] C.A. Bouman and M. Shapiro, “A Multiscale Random Field Model for Bayesian Image Segmentation,” IEEE Trans. Image Processing, vol. 3, no. 2, pp. 162-177, Mar. 1994.
[7] Z. Chi and S. Geman, “Estimation of Probabilistic Context-Free Grammars,” Computational Linguistics, vol. 24, no. 2, pp. 299-305, June 1998.
[8] H. Choi and R.G. Baraniuk, “Multiscale Document Segmentation Using Wavelet-Domain Hidden Markov Models,” Proc. Int'l Soc. Optical Eng./Soc. for Imaging, Science and Technology (SPIE/IS&T) 12th Ann. Int'l Symp.-Electronic Imaging, Jan. 2000.
[9] H. Choi and R.G. Baraniuk, “Multiscale Image Segmentation Using Wavelet-Domain Hidden Markov Models,” IEEE Trans. Image Processing, vol. 10, no. 9, pp. 1309-1321, Sept. 2001.
[10] N. Chomsky, The Logical Structure of Linguistic Theory. Plenum, 1955.
[11] N. Chomsky, Syntactic Structures. Mouton, The Hague, 1957.
[12] N. Chomsky, “On Certain Formal Properties of Grammars,” Information and Control, vol. 2, pp. 137-167, 1959.
[13] D. Comaniciu and P. Meer, “Mean Shift: A Robust Approach toward Feature Space Analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603-619, May 2002.
[14] I.J. Cox, S.B. Rao, and Y. Zhong, “Ratio Regions: A Technique for Image Segmentation,” Proc. Int'l Conf. Pattern Recognition, pp. 557-564, Aug. 1996.
[15] M.S. Crouse, R.D. Nowak, and R.G. Baraniuk, “Wavelet-Based Statistical Signal Processing Using Hidden Markov Models,” IEEE Trans. Signal Processing, vol. 46, no. 4, pp. 886-902, Apr. 1998.
[16] A.P. Dempster, N.M. Laird, and D.B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm (with Discussion),” J. Royal Statistical Soc. B, vol. 39, pp. 1-38, 1977.
[17] K.S. Fu, Syntactic Pattern Recognition and Applications. Prentice Hall, 1982.
[18] S. Geman and D. Geman, “Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 6, pp. 721-741, Nov. 1984.
[19] X. He, R.S. Zemel, and M.Á. Carreira-Perpi nán, “Multiscale Conditional Random Fields for Image Labeling,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2004.
[20] T. Kanungo and S. Mao, “Stochastic Language Models for Style-Directed Layout Analysis of Document Images,” IEEE Trans. Image Processing, vol. 12, no. 5, pp. 583-596, May 2003.
[21] T. Kasami, “An Efficient Recognition and Syntax Algorithm for Context-Free Languages,” Scientific Report AFCRL-65-758, Air Force Cambridge Research Laboratory, Bedford, Mass., 1965.
[22] G.E. Kopec and P.A. Chou, “Document Image Decoding Using Markov Source Models,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 6, pp. 602-617, June 1994.
[23] M. Krishnamoorthy, G. Nagy, S.C. Seth, and M. Viswanathan, “Syntactic Segmentation and Labeling of Digitized Pages from Technical Journals,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 7, pp. 737-747, July 1993.
[24] S. Kumar and M. Hebert, “A Hierarchical Field Framework for Unified Context-Based Classification,” Proc. IEEE Int'l Conf. Computer Vision, Oct. 2005.
[25] J. Lafferty, A. McCallum, and F. Pereira, “Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data,” Proc. Int'l Conf. Machine Learning, pp. 282-289, 2001.
[26] K. Lari and S.J. Young, “The Estimation of Stochastic Context-Free Grammars Using the Inside-Outside Algorithm,” Computer Speech and Language, vol. 4, no. 1, pp. 35-56, 1990.
[27] S.E. Levinson, “Continuously Variable Duration Hidden Markov Models for Speech Analysis,” Proc. Int'l Conf. Acoustic and Speech Signal Processing, pp. 1241-1244, 1986.
[28] J. Li and R.M. Gray, “Context-Based Multiscale Classification of Document Images Using Wavelet Coefficient Distributions,” IEEE Trans. Image Processing, vol. 9, no. 9, pp. 1604-1616, Sept. 2000.
[29] J. Li, R.M. Gray, and R.A. Olshen, “Multiresolution Image Classification by Hierarchical Modeling with Two-Dimensional Hidden Markov Models,” IEEE Trans. Information Theory, vol. 46, no. 5, pp. 1826-1841, Aug. 2000.
[30] M.R. Luettgen, W.C. Karl, A.S. Willsky, and R.R. Tenney, “Multiscale Representations of Markov Random Fields,” IEEE Trans. Signal Processing, vol. 41, no. 12, Dec. 1993.
[31] C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing. MIT Press, 1999.
[32] J.W. Modestino and J. Zhang, “A Markov Random Field Model-Based Approach to Image Interpretation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 14, no. 6, pp. 606-615, June 1992.
[33] J.-M. Morel and S. Solimini, Variational Methods in Image Segmentation. Birkhauser, 1995.
[34] G. Nagy and S.C. Seth, “Hierarchical Image Representation with Application to Optically Scanned Documents,” Proc. Int'l Conf. Pattern Recognition, pp. 347-349, July 1984.
[35] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.
[36] F. Pereira and Y. Schabes, “Inside-Outside Reestimation from Partially Bracketed Corpora,” Proc. 30th Ann. Meeting Assoc. Computational Linguistics, pp. 128-135, 1992.
[37] I. Pollak, A.S. Willsky, and H. Krim, “Image Segmentation and Edge Enhancement with Stabilized Inverse Diffusion Equations,” IEEE Trans. Image Processing, vol. 9, no. 2, Feb. 2000.
[38] I. Pollak, J.M. Siskind, M.P. Harper, and C.A. Bouman, “Modeling and Estimation of Spatial Random Trees with Application to Image Classification,” Proc. Int'l Conf. Acoustics, Speech, and Signal, Apr. 2003.
[39] I. Pollak, J.M. Siskind, M.P. Harper, and C.A. Bouman, “Parameter Estimation for Spatial Random Trees Using the EM Algorithm,” Proc. Int'l Conf. Image Processing, Sept. 2003.
[40] I. Pollak, J.M. Siskind, M.P. Harper, and C.A. Bouman, “Spatial Random Trees and the Center-Surround Algorithm,” Technical Report TR-ECE-03-03, School of Electrical and Computer Eng., Purdue Univ., Jan. 2003.
[41] G.G. Potamianos and J.K. Goutsias, “Partition Function Estimation of Gibbs Random Filed Images Using Monte Carlo Simulation,” IEEE Trans. Information Theory, vol. 39, no. 4, pp. 1322-1332, July 1993.
[42] D. Potter, “Compositional Pattern Recognition,” PhD dissertation, Brown Univ., 1999, http://www.dam.brown.edu/peopledfp.
[43] L.R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proc. IEEE, vol. 77, no. 2, pp.257-286, 1989.
[44] R.A. Redner and H.F. Walker, “Mixture Densities, Maximum Likelihood and the EM Algorithm,” SIAM Rev., vol. 26, no. 2, pp.195-239, Apr. 1984.
[45] J.K. Romberg, H. Choi, and R.G. Baraniuk, “Bayesian-Tree-Structured Image Modeling Using Wavelet-Domain Hidden Markov Models,” IEEE Trans. Image Processing, vol. 10, no. 7, pp.1056-1068, July 2001.
[46] A. Rosenfeld, Picture Languages. Kluwer Academic Press, 1979.
[47] D. Sankoff, “Branching Processes with Terminal Types: Application to Context-Free Grammars,” J. Applied Probability, vol. 8, pp.233-240, 1971.
[48] A.C. Shaw, “A Formal Picture Description Scheme as a Basis for Picture Processing Systems,” Information and Control, vol. 14, pp. 9-52, 1969.
[49] J. Shi and J. Malik, “Normalized Cuts and Image Segmentation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, Aug. 2000.
[50] A. Torralba, K.P. Murphy, and W.T. Freeman, “Contextual Models for Object Detection Using Boosted Random Fields,” Proc. 18th Ann. Conf. Neural Information Processing Systems, 2004.
[51] V.N. Vapnik, The Nature of Statistical Learning Theory. Springer, 1995.
[52] A.J. Viterbi, “Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm,” IEEE Trans. Information Theory, vol. 13, pp. 260-267, 1967.
[53] S. Wang and J.M. Siskind, “Image Segmentation with Ratio Cut,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 6, pp. 675-690, June 2003.
[54] W. Wang, I. Pollak, T.-S. Wong, C.A. Bouman, M.P. Harper, and J.M. Siskind, “Hierarchical Stochastic Image Grammars for Classification and Segmentation,” IEEE Trans. Image Processing, vol. 15, no. 10, pp. 3033-3052, Oct. 2006.
[55] C.F.J. Wu, “On the Convergence Properties of the EM Algorithm,” Annals of Statistics, vol. 11, no. 1, pp. 95-103, 1983.
[56] D.H. Younger, “Recognition and Parsing of Context-Free Languages in Time $O(n^{3})$ ,” Information and Control, vol. 10, no. 2, pp.189-208, 1967.

Index Terms:
Bayesian methods for image understanding, multiscale analysis
Citation:
J. M. Siskind, J. Sherman, Jr, I. Pollak, M. P. Harper, C. A. Bouman, "Spatial Random Tree Grammars for Modeling Hierarchal Structure in Images with Regions of Arbitrary Shape," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 9, pp. 1504-1519, June 2007, doi:10.1109/TPAMI.2007.1169
Usage of this product signifies your acceptance of the Terms of Use.