2012 IEEE 12th International Conference on Data Mining Workshops (2012)

Brussels, Belgium

Dec. 10, 2012

ISBN: 978-1-4673-5164-5

pp: 147-153

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2012.32

ABSTRACT

Block clustering (or co-clustering or simultaneous clustering) aims at simultaneously partitioning the rows and columns of a data table to reveal homogeneous block structures. This structure can stem from the latent block model which provides a probabilistic modelling of data tables whose block patterns are defined from the row and column classes. For continuous data, each table entry is typically assumed to follow a Gaussian distribution whose parameters are common to all entries belonging to the same block, that is, sharing the same row and column classes. For a given data table, several candidate models are usually examined: they may differ in the numbers of clusters or more generally in the number of free parameters of the model. Model selection then becomes a critical issue, for which the tools that have been derived for model-based one-way clustering need to be adapted. We develop here a criterion based on an approximation of the Integrated Classification Likelihood (ICL) of block models, and propose a BIC-like variant following a similar form. The proposed criteria are assessed on simulated data, where their performances are shown to be fairly reliable for medium to large data tables with well-separated clusters.
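The Gaussian latent block model described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and it does not implement the proposed ICL or BIC-like criteria, whose formulas are not given here. The function names `simulate_gaussian_lbm` and `block_means`, the mixing proportions, and the choice of a single shared standard deviation `sigma` are all assumptions made for illustration.

```python
import random


def simulate_gaussian_lbm(n_rows, n_cols, row_props, col_props, means, sigma, seed=0):
    """Draw a data table from a Gaussian latent block model (illustrative).

    row_props / col_props: assumed mixing proportions of row / column classes.
    means[k][l]: assumed mean of the block (row class k, column class l).
    sigma: assumed standard deviation, shared by all blocks for simplicity.
    """
    rng = random.Random(seed)
    # Latent row and column classes are drawn independently.
    z = rng.choices(range(len(row_props)), weights=row_props, k=n_rows)
    w = rng.choices(range(len(col_props)), weights=col_props, k=n_cols)
    # Each entry is Gaussian with the parameters of its block (z[i], w[j]).
    x = [[rng.gauss(means[z[i]][w[j]], sigma) for j in range(n_cols)]
         for i in range(n_rows)]
    return x, z, w


def block_means(x, z, w, n_row_classes, n_col_classes):
    """Average the entries of each block, given row and column labels."""
    sums = [[0.0] * n_col_classes for _ in range(n_row_classes)]
    counts = [[0] * n_col_classes for _ in range(n_row_classes)]
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            sums[z[i]][w[j]] += value
            counts[z[i]][w[j]] += 1
    # Empty blocks have no mean; report NaN for them.
    return [[sums[k][l] / counts[k][l] if counts[k][l] else float("nan")
             for l in range(n_col_classes)]
            for k in range(n_row_classes)]


# Hypothetical setting: 2 row classes, 3 column classes, well-separated means.
true_means = [[0.0, 5.0, 10.0], [10.0, 0.0, 5.0]]
x, z, w = simulate_gaussian_lbm(100, 90, [0.5, 0.5], [0.3, 0.3, 0.4],
                                true_means, sigma=0.5, seed=1)
estimated = block_means(x, z, w, 2, 3)
```

With the true labels known, the block averages recover the block means closely; in practice the labels are latent, which is why candidate models with different numbers of row and column classes must be compared with a selection criterion.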

INDEX TERMS

Data models, Adaptation models, Approximation methods, Computational modeling, Probabilistic logic, Robustness, Bayesian methods, simulated data, co-clustering, latent block model, model selection, integrated classification likelihood, BIC

CITATION

Aurore Lomet,
Gerard Govaert,
Yves Grandvalet,
"An Approximation of the Integrated Classification Likelihood for the Latent Block Model",

*2012 IEEE 12th International Conference on Data Mining Workshops*, pp. 147-153, 2012, doi:10.1109/ICDMW.2012.32