Mode-Finding for Mixtures of Gaussian Distributions
November 2000 (vol. 22 no. 11)
pp. 1318-1323

Abstract: Gradient-quadratic and fixed-point iteration algorithms, together with appropriate values for their control parameters, are derived for finding all modes of a Gaussian mixture, a problem with applications in clustering and regression. The significance of the modes found is quantified locally by Hessian-based error bars and globally by the entropy as a sparseness measure.
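
As a rough illustration of the fixed-point idea mentioned in the abstract, the sketch below treats the simplest case: a mixture whose components share one isotropic covariance sigma^2 I. Setting the gradient of the mixture density to zero then gives the update x <- sum_m p(m|x) mu_m, where p(m|x) is the posterior probability of component m at x. The function names, the tolerances, the choice to start one search from each component mean, and the distance-based merging of duplicate modes are all assumptions made for this example. It does not reproduce the paper's own algorithms, its control-parameter values, or its Hessian-based error bars.

```python
import numpy as np

def responsibilities(x, means, weights, sigma):
    """Posterior p(m | x) for an isotropic, shared-variance Gaussian mixture."""
    sq_dist = np.sum((means - x) ** 2, axis=1)
    # Log of weight * density, up to a constant common to all components.
    log_p = np.log(weights) - 0.5 * sq_dist / sigma**2
    log_p -= log_p.max()  # shift for numerical stability before exponentiating
    p = np.exp(log_p)
    return p / p.sum()

def find_mode(x0, means, weights, sigma, tol=1e-8, max_iter=1000):
    """Iterate x <- sum_m p(m | x) mu_m from x0 until the step is below tol."""
    means = np.asarray(means, dtype=float)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = responsibilities(x, means, weights, sigma) @ means
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def find_modes(means, weights, sigma, merge_tol=1e-4):
    """Run one search per component mean and keep the distinct limit points.

    Assumption: starting from the component means is one plausible way to
    look for all modes; it is not claimed here to be exhaustive.
    """
    means = np.asarray(means, dtype=float)
    weights = np.asarray(weights, dtype=float)
    modes = []
    for mu in means:
        m = find_mode(mu, means, weights, sigma)
        if all(np.linalg.norm(m - q) > merge_tol for q in modes):
            modes.append(m)
    return np.array(modes)

if __name__ == "__main__":
    # Two well-separated components, then two overlapping ones.
    print(find_modes([[0.0], [10.0]], [0.5, 0.5], sigma=1.0))
    print(find_modes([[0.0], [1.0]], [0.5, 0.5], sigma=1.0))
```

In this sketch, two equally weighted components placed far apart each keep their own mode, while two components one standard deviation apart merge into a single mode midway between them. A limit point of the iteration is only a stationary point of the density; confirming that it is a maximum would need a check of the Hessian there, which is omitted here.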

Index Terms:
Gaussian mixtures, maximization algorithms, mode finding, bump finding, error bars, sparseness.
Miguel Á. Carreira-Perpiñán, "Mode-Finding for Mixtures of Gaussian Distributions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1318-1323, Nov. 2000, doi:10.1109/34.888716