An EM Algorithm for the Block Mixture Model
April 2005 (vol. 27 no. 4)
pp. 643-647
Although many clustering procedures aim to construct an optimal partition of objects or, sometimes, of variables, other methods, called block clustering methods, consider the two sets simultaneously and organize the data into homogeneous blocks. Recently, we proposed a new mixture model, called the block mixture model, which takes this situation into account. This model allows one to embed simultaneous clustering of objects and variables in a mixture approach. We have studied this probabilistic model under the classification likelihood approach and developed a new algorithm for simultaneous partitioning based on the Classification EM algorithm. In this paper, we consider the block clustering problem under the maximum likelihood approach, and the goal of our contribution is to estimate the parameters of this model. Unfortunately, the EM algorithm cannot be applied directly to the block mixture model: difficulties arise from the dependence structure in the model, and approximations are required. Using a variational approximation, we propose a generalized EM algorithm to estimate the parameters of the block mixture model and, to illustrate our approach, we study the case of binary data by using a Bernoulli block mixture.
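
To make the idea of a variational EM scheme for a Bernoulli block mixture more concrete, the following is a minimal sketch in Python with NumPy. It is not taken from the paper and may differ from the authors' update schedule. It assumes a fully factorized (mean-field) approximation over row and column labels, and all names (`bernoulli_block_vem`, `s`, `t`, `pi`, `rho`, `alpha`, `eps`) are hypothetical choices made for illustration.

```python
import numpy as np

def softmax_rows(a):
    """Normalize each row of log-weights into probabilities (numerically stable)."""
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def bernoulli_block_vem(x, g, m, n_iter=50, seed=0, eps=1e-6):
    """Illustrative mean-field EM for a Bernoulli block mixture.

    x : (n, d) binary array; g, m : assumed numbers of row and column clusters.
    Returns soft row memberships s (n, g), soft column memberships t (d, m),
    row proportions pi, column proportions rho, and block parameters alpha (g, m).
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    # Random soft initial memberships (an assumption; other initializations are possible).
    s = rng.dirichlet(np.ones(g), size=n)
    t = rng.dirichlet(np.ones(m), size=d)
    for _ in range(n_iter):
        # M-step: proportions and per-block Bernoulli parameters.
        pi = s.mean(axis=0)
        rho = t.mean(axis=0)
        weights = np.outer(s.sum(axis=0), t.sum(axis=0)) + eps
        alpha = np.clip((s.T @ x @ t) / weights, eps, 1 - eps)
        log_a, log_1a = np.log(alpha), np.log1p(-alpha)
        # Variational update of row memberships, columns held fixed.
        s = softmax_rows(np.log(np.clip(pi, eps, None))
                         + x @ t @ log_a.T + (1 - x) @ t @ log_1a.T)
        # Variational update of column memberships, rows held fixed.
        t = softmax_rows(np.log(np.clip(rho, eps, None))
                         + x.T @ s @ log_a + (1 - x).T @ s @ log_1a)
    return s, t, pi, rho, alpha
```

A hard block partition can be read off with `s.argmax(axis=1)` and `t.argmax(axis=1)`. Like other EM-type procedures, a scheme of this kind is sensitive to initialization, so several random starts are usually advisable.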

Index Terms:
Block mixture model, EM algorithm, variational approximation.
Gérard Govaert, Mohamed Nadif, "An EM Algorithm for the Block Mixture Model," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 4, pp. 643-647, April 2005, doi:10.1109/TPAMI.2005.69