Bayesian Approaches to Gaussian Mixture Modeling
November 1998 (vol. 20 no. 11)
pp. 1133-1142

Abstract—A Bayesian-based methodology is presented which automatically penalizes overcomplex models being fitted to unknown data. We show that, with a Gaussian mixture model, the approach is able to select an "optimal" number of components in the model and so partition data sets. The performance of the Bayesian method is compared to other methods of optimal model selection and found to give good results. The methods are tested on synthetic and real data sets.
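
As a rough illustration of what "selecting the number of components" means in practice, the sketch below fits one-dimensional Gaussian mixtures with EM for several candidate component counts and scores each with the Bayesian information criterion (BIC). BIC is swapped in here as a simple, well-known complexity penalty; it is not the Bayesian method developed in the paper. The function names (`fit_gmm_1d`, `bic`), the synthetic data, the quantile-based initialisation and the variance floor are all assumptions made for this example.

```python
import math
import random


def fit_gmm_1d(data, k, iters=200):
    """EM for a k-component 1D Gaussian mixture.

    Returns (loglik, weights, means, variances); loglik is the value
    computed in the final E-step.
    """
    n = len(data)
    ordered = sorted(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    # Assumed initialisation: means at evenly spaced quantiles, broad variances.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    # Assumed safeguard: a variance floor stops components from collapsing.
    var_floor = 0.01 * grand_var
    loglik = 0.0
    for _ in range(iters):
        # E-step: responsibility of each component for each data point.
        resp = []
        loglik = 0.0
        for x in data:
            dens = [
                w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) + 1e-300  # guard against log(0)
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)
    return loglik, weights, means, variances


def bic(loglik, k, n):
    """BIC score (lower is better) for a k-component 1D mixture."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return -2.0 * loglik + n_params * math.log(n)


# Synthetic data: two well-separated clusters (an assumption for illustration).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(150)]
data += [rng.gauss(10.0, 1.0) for _ in range(150)]

fits = {k: fit_gmm_1d(data, k) for k in range(1, 5)}
scores = {k: bic(fits[k][0], k, len(data)) for k in fits}
best_k = min(scores, key=scores.get)
print("BIC by component count:", scores)
print("Selected number of components:", best_k)
```

The penalty term grows with the number of free parameters, so an extra component is accepted only if it improves the log-likelihood by more than the added complexity costs. A fully Bayesian treatment, as described in the abstract, would replace this fixed penalty with one derived from the model's posterior.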

Index Terms:
Cluster analysis, unsupervised learning, Bayesian methods, Gaussian mixture models.
Citation:
Stephen J. Roberts, Dirk Husmeier, Iead Rezek, William Penny, "Bayesian Approaches to Gaussian Mixture Modeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 11, pp. 1133-1142, Nov. 1998, doi:10.1109/34.730550