A Comparison of Algorithms for Inference and Learning in Probabilistic Graphical Models
September 2005 (vol. 27, no. 9)
pp. 1392-1416
Research into methods for reasoning under uncertainty is currently one of the most exciting areas of artificial intelligence, largely because it has recently become possible to record, store, and process large amounts of data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition, face detection, speaker identification, and prediction of gene function, it is even more exciting that researchers are on the verge of introducing systems that can perform large-scale combinatorial analyses of data, decomposing the data into interacting components. For example, computational methods for automatic scene analysis are now emerging in the computer vision community. These methods decompose an input image into its constituent objects, lighting conditions, motion patterns, etc. Two of the main challenges are finding effective representations and models in specific applications and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms. We review exact techniques and various approximate, computationally efficient techniques, including iterated conditional modes, the expectation maximization (EM) algorithm, Gibbs sampling, the mean field method, variational techniques, structured variational techniques, and the sum-product algorithm ("loopy" belief propagation). We describe how each technique can be applied in a vision model of multiple, occluding objects and contrast the behaviors and performances of the techniques using a unifying cost function, the free energy.
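
For readers who want the unifying cost function made explicit, a standard statement (in our notation here, not necessarily the paper's) is the variational free energy of an approximating distribution Q(h) over hidden variables h, given visible data v:

F(Q) = E_Q[ ln Q(h) ] - E_Q[ ln P(h, v) ] = KL( Q(h) || P(h|v) ) - ln P(v)

Because the KL divergence is nonnegative, F(Q) >= -ln P(v), with equality exactly when Q(h) equals the true posterior P(h|v). On this reading, the reviewed algorithms differ mainly in the family of Q over which F is (approximately) minimized, e.g., point-mass Q for iterated conditional modes, fully factorized Q for mean field, and partially factorized Q for structured variational methods.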


Index Terms:
Graphical models, Bayesian networks, probability models, probabilistic inference, reasoning, learning, Bayesian methods, variational techniques, sum-product algorithm, loopy belief propagation, EM algorithm, mean field, Gibbs sampling, free energy, Gibbs free energy, Bethe free energy.
Citation:
Brendan J. Frey, Nebojsa Jojic, "A Comparison of Algorithms for Inference and Learning in Probabilistic Graphical Models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 9, pp. 1392-1416, Sept. 2005, doi:10.1109/TPAMI.2005.169