2012 IEEE 12th International Conference on Data Mining Workshops (2012)
Brussels, Belgium
Dec. 10, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2012.32
Block clustering (or co-clustering or simultaneous clustering) aims at simultaneously partitioning the rows and columns of a data table to reveal homogeneous block structures. This structure can stem from the latent block model which provides a probabilistic modelling of data tables whose block patterns are defined from the row and column classes. For continuous data, each table entry is typically assumed to follow a Gaussian distribution whose parameters are common to all entries belonging to the same block, that is, sharing the same row and column classes. For a given data table, several candidate models are usually examined: they may differ in the numbers of clusters or more generally in the number of free parameters of the model. Model selection then becomes a critical issue, for which the tools that have been derived for model-based one-way clustering need to be adapted. We develop here a criterion based on an approximation of the Integrated Classification Likelihood (ICL) of block models, and propose a BIC-like variant following a similar form. The proposed criteria are assessed on simulated data, where their performances are shown to be fairly reliable for medium to large data tables with well-separated clusters.
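The generative assumption described above can be illustrated with a minimal sketch, which is not the authors' implementation. It draws row and column classes independently, then draws each entry from a Gaussian whose mean and standard deviation depend only on the block (row class, column class) it falls in. It also evaluates the complete-data log-likelihood, the quantity that classification-likelihood criteria build on. The function names, argument names, and parameter values are hypothetical, and the approximation of the ICL proposed in the paper is not reproduced here.

```python
import numpy as np

def simulate_lbm(n_rows, n_cols, row_props, col_props, means, sigmas, seed=0):
    """Draw a table from a Gaussian latent block model (illustrative sketch).

    row_props, col_props: class proportions for rows and columns (assumed names).
    means, sigmas: arrays of shape (n_row_classes, n_col_classes), one
    Gaussian mean and standard deviation per block.
    """
    rng = np.random.default_rng(seed)
    row_props = np.asarray(row_props, dtype=float)
    col_props = np.asarray(col_props, dtype=float)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Latent row classes z and column classes w are drawn independently.
    z = rng.choice(len(row_props), size=n_rows, p=row_props)
    w = rng.choice(len(col_props), size=n_cols, p=col_props)
    # Entry (i, j) uses the parameters of block (z[i], w[j]).
    mu = means[np.ix_(z, w)]
    sd = sigmas[np.ix_(z, w)]
    x = rng.normal(mu, sd)
    return x, z, w

def complete_loglik(x, z, w, row_props, col_props, means, sigmas):
    """Complete-data log-likelihood log p(x, z, w) for given parameters."""
    row_props = np.asarray(row_props, dtype=float)
    col_props = np.asarray(col_props, dtype=float)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    mu = means[np.ix_(z, w)]
    sd = sigmas[np.ix_(z, w)]
    # Gaussian log-density of every entry given its block.
    ll_entries = -0.5 * np.log(2.0 * np.pi * sd**2) - (x - mu) ** 2 / (2.0 * sd**2)
    # Add the log-probabilities of the row and column class assignments.
    ll_classes = np.log(row_props[z]).sum() + np.log(col_props[w]).sum()
    return ll_classes + ll_entries.sum()
```

A table with two row classes and two column classes could be drawn with `simulate_lbm(30, 20, [0.5, 0.5], [0.5, 0.5], [[0, 4], [8, 12]], [[1, 1], [1, 1]])`. Candidate models with different numbers of classes would each be fitted and then compared through a penalized version of this log-likelihood.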
Data models, Adaptation models, Approximation methods, Computational modeling, Probabilistic logic, Robustness, Bayesian methods, simulated data, co-clustering, latent block model, model selection, integrated classification likelihood, BIC
A. Lomet, G. Govaert and Y. Grandvalet, "An Approximation of the Integrated Classification Likelihood for the Latent Block Model," 2012 IEEE 12th International Conference on Data Mining Workshops (ICDMW), Brussels, Belgium, 2012, pp. 147-153.