Maximum Weighted Likelihood via Rival Penalized EM for Density Mixture Clustering with Automatic Model Selection
June 2005 (vol. 17 no. 6)
pp. 750-761
The Expectation-Maximization (EM) algorithm [10] has been extensively used in density mixture clustering, but it is unable to perform model selection automatically. This paper therefore proposes to learn the model parameters by maximizing a weighted likelihood. Under a specific weight design, we derive a Rival Penalized Expectation-Maximization (RPEM) algorithm, in which the components of a density mixture compete with each other at each time step. Not only are the parameters of the winning component updated to adapt to an input, but the parameters of all rivals are also penalized, with a strength proportional to their posterior probabilities. In contrast to the EM algorithm [10], RPEM is able to fade out the redundant densities from a density mixture during the learning process. Hence, it can automatically select an appropriate number of densities in density mixture clustering. We experimentally demonstrate its outstanding performance on Gaussian mixtures and on a color image segmentation problem. Moreover, a simplified version of RPEM generalizes our recently proposed RPCCL algorithm [8], making it applicable to elliptical clusters as well, with any input proportion. Compared to the existing heuristic RPCL [25] and its variants, this generalized RPCCL (G-RPCCL) circumvents the difficult preselection of the so-called delearning rate. Additionally, a special setting of G-RPCCL not only degenerates to RPCL and its Type A variant, but also provides guidance for choosing an appropriate delearning rate for them. Subsequently, we propose stochastic versions of RPCL and its Type A variant, respectively, in which the difficult problem of selecting a delearning rate is circumvented in a novel way. The experiments show the promising results of this stochastic implementation.
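The paper's actual weight design is more elaborate than can be conveyed in an abstract; the sketch below only illustrates the core winner-reward / rival-penalty mechanism described above, for the means of a one-dimensional Gaussian mixture with fixed variances and mixing proportions. The function name `rpem_step` and the learning rate `eta` are hypothetical, not from the paper.

```python
import numpy as np

def rpem_step(x, means, variances, alphas, eta=0.05):
    """One stochastic RPEM-style update on the means of a 1-D Gaussian mixture.

    For an input x, the winning component (largest posterior) is moved
    toward x, while every rival component is pushed away from x with a
    strength proportional to its posterior probability, so that redundant
    components are gradually faded out of the mixture.
    """
    # Posterior probabilities h_j(x) of each component given the input x.
    dens = alphas * np.exp(-(x - means) ** 2 / (2.0 * variances)) \
           / np.sqrt(2.0 * np.pi * variances)
    h = dens / dens.sum()
    c = int(np.argmax(h))  # winner: component with the largest posterior

    for j in range(len(means)):
        if j == c:
            means[j] += eta * (x - means[j])          # reward the winner
        else:
            means[j] -= eta * h[j] * (x - means[j])   # penalize each rival
    return means
```

For example, with means at 0, 5, and 10 and an input x = 0.5, the first component wins and moves toward 0.5, while the other two are nudged away; a redundant component sitting near a genuine one would be repeatedly driven off by such penalties.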

[1] S.C. Ahalt, A.K. Krishnamurthy, P. Chen, and D.E. Melton, “Competitive Learning Algorithms for Vector Quantization,” Neural Networks, vol. 3, pp. 277-291, 1990.
[2] H. Akaike, “Information Theory and an Extension of the Maximum Likelihood Principle,” Proc. Second Int'l Symp. Information Theory, pp. 267-281, 1973.
[3] H. Akaike, “A New Look at the Statistical Model Identification,” IEEE Trans. Automatic Control, vol. AC-19, pp. 716-723, 1974.
[4] H. Bozdogan, “Model Selection and Akaike's Information Criterion: The General Theory and Its Analytical Extensions,” Psychometrika, vol. 52, no. 3, pp. 345-370, 1987.
[5] H. Bozdogan, “Mixture-Model Cluster Analysis Using Model Selection Criteria and a New Information Measure of Complexity,” Proc. First US/Japan Conf. the Frontiers of Statistical Modeling, vol. 2, pp. 69-113, 1994.
[6] J. Banfield and A. Raftery, “Model-Based Gaussian and Non-Gaussian Clustering,” Biometrics, vol. 49, pp. 803-821, 1993.
[7] B. Fritzke, “The LBG-U Method for Vector Quantization— An Improvement over LBG Inspired From Neural Networks,” Neural Processing Letters, vol. 5, no. 1, pp. 35-45, 1997.
[8] Y.M. Cheung, “Rival Penalization Controlled Competitive Learning for Data Clustering with Unknown Cluster Number,” Proc. Ninth Int'l Conf. Neural Information Processing (Paper ID: 1983 in CD-ROM Proceeding), Nov. 2002.
[9] Y.M. Cheung, “k*-Means—A New Generalized k-Means Clustering Algorithm,” Pattern Recognition Letters, vol. 24, no. 15, pp. 2883-2893, 2003.
[10] A.P. Dempster, N.M. Laird, and D.B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,” J. Royal Statistical Soc., Series B, vol. 39, no. 1, pp. 1-38, 1977.
[11] P.A. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach. Prentice-Hall, 1982.
[12] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. Wiley, 1973.
[13] L. Kaufman and P. Rousseeuw, Finding Groups in Data. New York: John Wiley and Sons, 1989.
[14] Y.W. Lim and S.U. Lee, “On the Color Image Segmentation Algorithm Based on the Thresholding and the Fuzzy C-Means Techniques,” Pattern Recognition, vol. 23, no. 9, pp. 935-952, 1990.
[15] Y. Linde, A. Buzo, and R.M. Gray, “An Algorithm for Vector Quantizer Design,” IEEE Trans. Comm., COM-28, pp. 84-95, 1980.
[16] J.B. MacQueen, “Some Methods for Classification and Analysis of Multivariate Observations,” Proc. Fifth Berkeley Symp. Math. Statistics and Probability, vol. 1, pp. 281-297, Berkeley, Calif.: Univ. of California Press, 1967.
[17] G.J. McLachlan and K.E. Basford, Mixture Models: Inference and Application to Clustering. Dekker, 1988.
[18] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Advances in Knowledge Discovery and Data Mining. MIT Press, 1996.
[19] R.A. Redner and H.F. Walker, “Mixture Densities, Maximum Likelihood, and the EM Algorithm,” SIAM Rev., vol. 26, pp. 195-239, 1984.
[20] G. Schwarz, “Estimating the Dimension of a Model,” The Annals of Statistics, vol. 6, no. 2, pp. 461-464, 1978.
[21] B.W. Silverman, Density Estimation for Statistics and Data Analysis. London: Chapman & Hall, 1986.
[22] T. Uchiyama and M.A. Arbib, “Color Image Segmentation Using Competitive Learning,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 12, Dec. 1994.
[23] L. Xu, “Bayesian Ying-Yang Machine, Clustering and Number of Clusters,” Pattern Recognition Letters, vol. 18, nos. 11-13, pp. 1167-1178, 1997.
[24] L. Xu, “Rival Penalized Competitive Learning, Finite Mixture, and Multisets Clustering,” Proc. Int'l Joint Conf. Neural Networks, vol. 2, pp. 2525-2530, 1998.
[25] L. Xu, A. Krzyżak, and E. Oja, “Rival Penalized Competitive Learning for Clustering Analysis, RBF Net, and Curve Detection,” IEEE Trans. Neural Networks, vol. 4, pp. 636-648, 1993.

Index Terms:
Maximum weighted likelihood, rival penalized Expectation-Maximization algorithm, generalized rival penalization controlled competitive learning, cluster number, stochastic implementation.
Citation:
Yiu-ming Cheung, "Maximum Weighted Likelihood via Rival Penalized EM for Density Mixture Clustering with Automatic Model Selection," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 750-761, June 2005, doi:10.1109/TKDE.2005.97