Hidden Markov Random Field Model Selection Criteria Based on Mean Field-Like Approximations
September 2003 (vol. 25 no. 9)
pp. 1089-1101

Abstract: Hidden Markov random fields appear naturally in problems such as image segmentation, where an unknown class assignment has to be estimated from the observations at each pixel. Choosing the probabilistic model that best accounts for the observations is an important first step that determines the quality of the subsequent estimation and analysis. A commonly used selection criterion is the Bayesian Information Criterion (BIC) of Schwarz (1978), but for hidden Markov random fields its exact computation is intractable because of the dependence structure induced by the Markov model. We propose approximations of BIC based on the mean field principle of statistical physics. Mean field theory approximates Markov random fields by systems of independent variables, which leads to tractable computations. Using this principle, we first derive a class of criteria by approximating the Markov distribution in the usual expression of BIC as a penalized likelihood. We then rewrite BIC in terms of normalizing constants, also called partition functions, instead of Markov distributions. This enables us to use finer mean field approximations and to derive other criteria using optimal lower bounds for the normalizing constants. To illustrate the performance of our partition function-based approximation of BIC as a model selection criterion, we focus on the preliminary issue of choosing the number of classes before the segmentation task. Experiments on simulated and real data suggest that our criterion is promising: it takes spatial information into account through the Markov model and improves on the results obtained with BIC for independent mixture models.
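To make the idea of a mean field lower bound on a partition function concrete, here is a minimal sketch. It is not the authors' criterion. It assumes a K-class Potts model on a small 4-neighbour grid with interaction parameter beta (the abstract does not fix the Markov model), and all function names are hypothetical. For any factorized distribution q, Jensen's inequality gives log W >= E_q[beta * (number of agreeing neighbour pairs)] + entropy(q). The sketch compares that bound, evaluated at a mean field fixed point, with the exact log W obtained by brute-force enumeration.

```python
import itertools
import math


def grid_edges(rows, cols):
    """Return the 4-neighbour edges of a rows x cols grid as index pairs."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((i, i + 1))      # right neighbour
            if r + 1 < rows:
                edges.append((i, i + cols))   # neighbour below
    return edges


def exact_log_partition(n_sites, edges, n_classes, beta):
    """Exact log W by summing over all n_classes ** n_sites labelings.

    Assumed model (illustrative): P(z) = exp(beta * #agreeing edges) / W.
    Only feasible for very small grids.
    """
    total = 0.0
    for z in itertools.product(range(n_classes), repeat=n_sites):
        agree = sum(1 for i, j in edges if z[i] == z[j])
        total += math.exp(beta * agree)
    return math.log(total)


def mean_field_log_partition_bound(n_sites, edges, n_classes, beta, n_sweeps=200):
    """Lower bound on log W from a factorized (mean field) distribution q."""
    neighbours = [[] for _ in range(n_sites)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)

    # Slightly non-uniform start, normalized so that each q[i] sums to 1.
    q = []
    for i in range(n_sites):
        row = [1.0 + 0.01 * ((i + k) % n_classes) for k in range(n_classes)]
        s = sum(row)
        q.append([x / s for x in row])

    # Coordinate-wise fixed-point updates: q_i(k) proportional to
    # exp(beta * sum of q_j(k) over the neighbours j of site i).
    for _ in range(n_sweeps):
        for i in range(n_sites):
            field = [beta * sum(q[j][k] for j in neighbours[i])
                     for k in range(n_classes)]
            m = max(field)  # subtract the max for numerical stability
            w = [math.exp(f - m) for f in field]
            s = sum(w)
            q[i] = [x / s for x in w]

    # Bound = expected interaction term under q + entropy of q.
    energy = beta * sum(q[i][k] * q[j][k]
                        for i, j in edges for k in range(n_classes))
    entropy = -sum(p * math.log(p) for row in q for p in row if p > 0.0)
    return energy + entropy
```

For example, with `edges = grid_edges(3, 3)`, calling `exact_log_partition(9, edges, 2, 0.5)` and `mean_field_log_partition_bound(9, edges, 2, 0.5)` gives two numbers, and the second can never exceed the first. The bound holds for any factorized q, so running the fixed-point updates only tightens it. On grids of realistic image size the enumeration is impossible, which is why a tractable bound of this kind is attractive.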

[1] G. Schwarz, "Estimating the Dimension of a Model," The Annals of Statistics, vol. 6, pp. 461-464, 1978.
[2] H. Akaike, "Information Theory and an Extension of the Maximum Likelihood Principle," Proc. Second Int'l Symp. Information Theory, B.N. Petrov and F. Csaki, eds., pp. 267-281, 1973.
[3] J. Rissanen, Stochastic Complexity in Statistical Inquiry. World Scientific Series in Computer Science, vol. 15, 1989.
[4] P. Zhang, "Model Selection via Multifold Cross Validation," The Annals of Statistics, vol. 21, pp. 299-313, 1993.
[5] R. Kass and A. Raftery, "Bayes Factors," J. Am. Statistical Assoc., vol. 90, pp. 773-795, 1995.
[6] J.O. Berger and T. Sellke, "Testing a Point Null Hypothesis: The Irreconcilability of P-Values and Evidence," J. Am. Statistical Assoc., vol. 82, pp. 112-122, 1987.
[7] A. Raftery, "Bayesian Model Selection in Social Research (with discussion)," Sociological Methodology, P.V. Marsden, ed., Cambridge, Mass.: Blackwell, pp. 111-163, 1995.
[8] E. Gassiat, "Likelihood Ratio Inequalities with Applications to Various Mixtures," Technical Report 2001-20, Mathematiques, Orsay, 2001.
[9] C. Ji and L. Seymour, "A Consistent Model Selection Procedure for Markov Random Fields Based on Penalized Pseudolikelihood," Annals of Applied Probability, vol. 6, pp. 423-443, 1996.
[10] J. Besag, "Statistical Analysis of Non-Lattice Data," The Statistician, vol. 24, pp. 179-195, 1975.
[11] L. Seymour and C. Ji, "Approximate Bayes Model Selection Procedures for Gibbs-Markov Random Fields," J. Statistical Planning and Inference, vol. 51, pp. 75-97, 1996.
[12] D. Stanford and A.E. Raftery, "Determining the Number of Colors or Gray Levels in an Image Using Approximate Bayes Factors: The Pseudolikelihood Information Criterion (PLIC)," technical report, Dept. of Statistics, Univ. of Washington, http://www.stat.washington.edu/, Feb. 2001.
[13] W. Qian and D.M. Titterington, "Estimation of Parameters in Hidden Markov Models," Philosophical Trans. Royal Soc. London A, vol. 337, pp. 407-428, 1991.
[14] D. Chandler, Introduction to Modern Statistical Mechanics. Oxford Univ. Press, 1987.
[15] G. Celeux, F. Forbes, and N. Peyrard, "EM Procedures Using Mean Field-Like Approximations for Markov Model-Based Image Segmentation," Pattern Recognition, vol. 36, no. 1, pp. 131-144, 2003.
[16] C. Biernacki, G. Celeux, and G. Govaert, "Assessing a Mixture Model for Clustering with the Integrated Completed Likelihood," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, pp. 719-725, 2000.
[17] J.M. Hammersley and P.E. Clifford, Markov Fields on Finite Graphs and Lattices. 1971.
[18] D. Geman, "Random Fields and Inverse Problems in Imaging," Lecture Notes in Math., vol. 1427, New York: Springer, pp. 113-193, 1991.
[19] D. Stanford, "Fast Automatic Unsupervised Image Segmentation and Curve Detection in Spatial Point Processes," PhD thesis, Dept. of Statistics, Univ. of Washington, Seattle, 1999.
[20] G.J. McLachlan and D. Peel, Finite Mixture Models. Wiley, 2000.
[21] C. Fraley and A. Raftery, "How Many Clusters? Which Clustering Method? Answers via Model-Based Cluster Analysis," Computer J., vol. 41, pp. 578-588, 1998.
[22] K. Roeder and L.A. Wasserman, "Practical Bayesian Density Estimation Using Mixtures of Normals," J. Am. Statistical Assoc., vol. 92, pp. 894-902, 1997.
[23] M. Newton and A. Raftery, "Approximate Bayesian Inference by the Weighted Likelihood Bootstrap (with discussion)," J. Royal Statistical Soc. B, vol. 56, pp. 3-48, 1994.
[24] D. Geiger and F. Girosi, "Parallel and Deterministic Algorithms from MRFs: Surface Reconstruction," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, no. 5, pp. 401-412, May 1991.
[25] J. Zerubia and R. Chellappa, "Mean Field Approximation Using Compound Gauss-Markov Random Field for Edge Detection and Image Restoration," Proc. Int'l Conf. Acoustics, Speech, and Signal Processing, pp. 2193-2196, 1990.
[26] A.L. Yuille, "Generalized Deformable Models, Statistical Physics and Matching Problems," Neural Computation, vol. 2, pp. 1-24, 1990.
[27] T.S. Jaakkola and M.I. Jordan, "Improving the Mean Field Approximation via the Use of Mixture Distributions," Learning in Graphical Models, M.I. Jordan, ed., Dordrecht: Kluwer Academic Publishers, pp. 163-173, 1998.
[28] T. Hofmann and M. Buhmann, "Pairwise Data Clustering by Deterministic Annealing," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 1, pp. 1-14, Jan. 1997.
[29] G.E.B. Archer and D.M. Titterington, "Parameter Estimation for Hidden Markov Chains," J. Statistical Planning and Inference, 2002.
[30] J. Besag, "On the Statistical Analysis of Dirty Pictures," J. Royal Statistical Soc. B, vol. 48, pp. 259-302, 1986.
[31] S. Geman and D. Geman, "Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 6, pp. 721-741, 1984.
[32] N. Peyrard, "Approximations de Type Champ Moyen des Modèles de Champ de Markov pour la Segmentation de Données Spatiales," PhD thesis, U.F.R. d'Informatique et de Math. Appliquées, Univ. Joseph Fourier, Grenoble I, France, 2001.
[33] Z. Zhou, R. Leahy, and J. Qi, "Approximate Maximum Likelihood Hyperparameter Estimation for Gibbs Priors," IEEE Trans. Image Processing, vol. 6, no. 6, pp. 844-861, 1997.
[34] G. Potamianos and J. Goutsias, "Stochastic Approximation Algorithms for Partition Function Estimation of Gibbs Random Fields," IEEE Trans. Information Theory, vol. 43, no. 6, pp. 1948-1965, 1997.
[35] G.J. McLachlan and K.E. Basford, Mixture Models: Inference and Applications to Clustering. Dekker, 1987.

Index Terms:
Image segmentation, hidden Markov random fields, model selection, Bayesian Information Criterion, mean field approximation, partition function.
Citation:
Florence Forbes, Nathalie Peyrard, "Hidden Markov Random Field Model Selection Criteria Based on Mean Field-Like Approximations," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 9, pp. 1089-1101, Sept. 2003, doi:10.1109/TPAMI.2003.1227985