Bibliographic References
Minimum-Entropy Data Partitioning Using Reversible Jump Markov Chain Monte Carlo
August 2001 (vol. 23 no. 8)
pp. 909-914

Abstract—Problems in data analysis often require the unsupervised partitioning of a data set into classes. Several methods exist for such partitioning, but many have the weakness of being formulated via strict parametric models (e.g., each class is modeled by a single Gaussian) or of being computationally intensive in high-dimensional data spaces. We reconsider cluster analysis in information-theoretic terms and show that an efficient partitioning may be obtained by minimizing the partition entropy. A reversible-jump sampling scheme is introduced to explore the variable-dimension space of partition models.
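The partition entropy referred to in the abstract can be illustrated concretely: given soft class-membership probabilities p(k | x_n) for each sample, the quantity minimized is the mean Shannon entropy of those assignments, which is low when the partition is crisp and high when samples are ambiguously assigned. The sketch below is an illustrative assumption based on the abstract, not the paper's full sampling scheme (the function name `partition_entropy` and the toy arrays are hypothetical):

```python
import numpy as np

def partition_entropy(posteriors):
    """Mean Shannon entropy of soft class assignments.

    posteriors: (N, K) array whose row n holds p(k | x_n), the
    posterior probability of each of K classes for sample n.
    A crisp partition (rows near one-hot) gives entropy near 0;
    maximally ambiguous rows give entropy log(K).
    """
    p = np.clip(posteriors, 1e-12, 1.0)  # guard against log(0)
    return float(-np.mean(np.sum(p * np.log(p), axis=1)))

# Crisp assignments score lower than ambiguous ones,
# so minimizing this criterion favors well-separated partitions.
crisp = np.array([[0.99, 0.01], [0.02, 0.98]])
fuzzy = np.array([[0.5, 0.5], [0.6, 0.4]])
assert partition_entropy(crisp) < partition_entropy(fuzzy)
```

In the paper this criterion is explored over partitions whose number of classes is itself unknown, which is what motivates the reversible-jump MCMC machinery.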

[1] S. Aeberhard, D. Coomans, and O. de Vel, “Comparative-Analysis of Statistical Pattern-Recognition Methods in High-Dimensional Settings,” Pattern Recognition, vol. 27, no. 8, pp. 1065-1077, 1994.
[2] C. Andrieu, N. de Freitas, and A. Doucet, “Sequential MCMC for Bayesian Model Selection,” Proc. IEEE Signal Processing Workshop Higher Order Statistics, June 1999.
[3] J.M. Bernardo and A.F.M. Smith, Bayesian Theory. John Wiley, 1994.
[4] C.M. Bishop, Neural Networks for Pattern Recognition. Clarendon Press, 1995.
[5] A.P. Dempster, N.M. Laird, and D.B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,” J. Royal Statistical Soc., vol. 39, no. 1, pp. 1-38, 1977.
[6] K. Fukunaga, Introduction to Statistical Pattern Recognition, second edition. Academic Press, 1990.
[7] P. Green, “Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination,” Biometrika, vol. 82, pp. 711-732, 1995.
[8] C. Holmes and B.K. Mallick, “Bayesian Radial Basis Functions of Variable Dimension,” Neural Computation, vol. 10, pp. 1217-1233, 1998.
[9] A.K. Jain and R.C. Dubes, Algorithms for Clustering Data. Englewood Cliffs, N.J.: Prentice Hall, 1988.
[10] R. Neal, Bayesian Learning for Neural Networks. New York: Springer Verlag, 1996.
[11] I.A. Rezek and S.J. Roberts, “Stochastic Complexity Measures for Physiological Signal Analysis,” IEEE Trans. Biomedical Eng., vol. 44, no. 9, 1998.
[12] S. Richardson and P.J. Green, “On Bayesian Analysis of Mixtures with an Unknown Number of Components,” J. Royal Statistical Soc. (Series B), vol. 59, no. 4, pp. 731-758, 1997.
[13] S.J. Roberts, “Parametric and Non-Parametric Unsupervised Cluster Analysis,” Pattern Recognition, vol. 30, no. 2, pp. 261-272, 1997.
[14] S.J. Roberts, R. Everson, and I. Rezek, “Maximum Certainty Data Partitioning,” Pattern Recognition, vol. 33, no. 5, pp. 833-839, 2000.
[15] S. Roberts, D. Husmeier, I. Rezek, and W. Penny, “Bayesian Approaches to Gaussian Mixture Modeling,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 11, Nov. 1998.
[16] K. Rose, E. Gurewitz, and G.C. Fox, “A Deterministic Annealing Approach to Clustering,” Pattern Recognition Letters, vol. 11, no. 9, pp. 589-594, 1990.
[17] L. Tierney, “Markov Chains for Exploring Posterior Distributions,” Annals of Statistics, vol. 22, pp. 1701-1762, 1994.
[18] R. Wilson and M. Spann, “A New Approach to Clustering,” Pattern Recognition, vol. 23, no. 12, pp. 1413-1425, 1990.

Index Terms:
Unsupervised data analysis, mixture models, Bayesian analysis, reversible-jump Markov Chain Monte Carlo, number of clusters.
Stephen J. Roberts, Chris Holmes, Dave Denison, "Minimum-Entropy Data Partitioning Using Reversible Jump Markov Chain Monte Carlo," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 8, pp. 909-914, Aug. 2001, doi:10.1109/34.946994