Seventh IEEE International Conference on Data Mining (ICDM 2007)
Omaha, Nebraska, USA
Oct. 28, 2007 to Oct. 31, 2007
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2007.40
That the models resulting from data mining should be understandable is an uncontroversial requirement. In the data mining literature, however, it plays hardly any role, if any at all. In practice, though, understandability is often even more important than, e.g., accuracy. Understandability does not mean that models should be simple; it means that one should be able to understand the predictions of models. In this paper we introduce tools to understand arbitrary classifiers defined on discrete data. In particular, we introduce Explanations, which provide insight at a local level: they explain why a classifier classifies a data point as it does. For global insight, we introduce attribute weights: the higher the weight of an attribute, the more often it is decisive in the classification of a data point. To illustrate our tools, we describe a case study in the prediction of small genes, a notoriously hard problem in bioinformatics.
A. Siebes and M. Subianto, "Understanding Discrete Classifiers with a Case Study in Gene Prediction," Seventh IEEE International Conference on Data Mining (ICDM 2007), Omaha, Nebraska, USA, 2007, pp. 661-666.