2012 IEEE 12th International Conference on Data Mining Workshops
Towards Utility Maximization in Regression
Brussels, Belgium
December 10, 2012
ISBN: 978-1-4673-5164-5
Utility-based learning is a key technique for addressing many real-world data mining applications in which the costs/benefits are not uniform across the domain of the target variable. Still, most existing research has focused on classification problems. In this paper we address the related problem for regression tasks. There are many relevant domains (e.g. ecology, meteorology, finance) where decisions are based on the forecast of a numeric quantity (i.e. the result of a regression model). The goal of this paper is to present an evaluation framework for applications where the numeric outcome of a regression model may lead to different costs/benefits as a consequence of the actions it entails. The proposed metric provides a more informed estimate of the utility of any regression model, given the application-specific preference biases, and hence makes the comparison and selection among alternative regression models more reliable. We illustrate the objective of our evaluation methodology on a real-life application and also carry out a set of experiments on a subset of our target regression tasks: the prediction of rare and extreme values. Results show the effectiveness of our proposed utility metric for identifying the models that perform better in this type of application.
Standards, Biological system modeling, Context, Accuracy, Predictive models, Equations, Mathematical model, utility-based performance estimate, Cost-sensitive learning, regression
