Issue No.11 - November (2007 vol.19)
The traditional setting of supervised learning requires a large number of labeled training examples to achieve good generalization. However, in many practical applications, unlabeled training examples are readily available, whereas labeled ones are fairly expensive to obtain. Semi-supervised learning has therefore attracted much attention. Previous research on semi-supervised learning has focused mainly on semi-supervised classification. Although regression is almost as important as classification, semi-supervised regression remains largely understudied. In particular, although co-training is a main paradigm in semi-supervised learning, few works have been devoted to co-training style semi-supervised regression algorithms. In this paper, a co-training style semi-supervised regression algorithm, Coreg, is proposed. The algorithm uses two regressors, each of which labels unlabeled data for the other, where the confidence in labeling an unlabeled example is estimated by the reduction in mean squared error over the labeled neighborhood of that example. Analysis and experiments show that Coreg can effectively exploit unlabeled data to improve regression estimates.
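The co-training loop and the confidence estimate described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not fix the regressor type, so the sketch assumes two k-nearest-neighbor regressors that differ only in their Minkowski distance order, and it assumes a randomly sampled pool of unlabeled candidates per round. All function names, parameters, and default values (`coreg_sketch`, `mse_reduction`, `k`, `p1`, `p2`, `pool_size`, `rounds`) are hypothetical.

```python
import random

def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length tuples."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def nearest(train, query, k, p):
    """The k (features, target) pairs in train closest to query."""
    return sorted(train, key=lambda xy: minkowski(xy[0], query, p))[:k]

def knn_predict(train, query, k, p):
    """Plain kNN regression: average the targets of the k nearest pairs."""
    hood = nearest(train, query, k, p)
    return sum(y for _, y in hood) / len(hood)

def mse_reduction(train, x_u, k, p):
    """Confidence of pseudo-labeling x_u, plus the pseudo-label itself.

    Confidence is the drop in summed squared error over the labeled
    neighbors of x_u when (x_u, y_u) is added to the training set.
    Assumption: each neighbor is predicted with itself still in the set.
    """
    y_u = knn_predict(train, x_u, k, p)
    augmented = train + [(x_u, y_u)]
    delta = 0.0
    for x_i, y_i in nearest(train, x_u, k, p):
        before = knn_predict(train, x_i, k, p)
        after = knn_predict(augmented, x_i, k, p)
        delta += (y_i - before) ** 2 - (y_i - after) ** 2
    return delta, y_u

def coreg_sketch(labeled, unlabeled, k=3, p1=2, p2=5,
                 rounds=20, pool_size=20, seed=0):
    """Co-training style regression; returns a predict(x) function."""
    rng = random.Random(seed)
    sets = [list(labeled), list(labeled)]   # one training set per regressor
    orders = [p1, p2]                       # assumed source of diversity
    pool_source = list(unlabeled)
    for _ in range(rounds):
        if not pool_source:
            break
        picks = []
        for j in (0, 1):
            pool = rng.sample(pool_source, min(pool_size, len(pool_source)))
            best, best_delta = None, 0.0
            for x_u in pool:
                delta, y_u = mse_reduction(sets[j], x_u, k, orders[j])
                if delta > best_delta:      # keep only error-reducing labels
                    best, best_delta = (x_u, y_u), delta
            picks.append(best)
        if picks[0] is None and picks[1] is None:
            break                           # neither regressor is confident
        for j in (0, 1):
            if picks[j] is not None:
                sets[1 - j].append(picks[j])  # label goes to the OTHER regressor
                if picks[j][0] in pool_source:
                    pool_source.remove(picks[j][0])

    def predict(x):
        # Final estimate: average of the two refined regressors.
        return 0.5 * (knn_predict(sets[0], x, k, orders[0]) +
                      knn_predict(sets[1], x, k, orders[1]))
    return predict
```

A call such as `coreg_sketch(labeled, unlabeled)` expects `labeled` as a list of `(feature_tuple, target)` pairs and `unlabeled` as a list of feature tuples. Requiring a strictly positive error reduction before accepting a pseudo-label is one simple way to keep noisy labels from being passed between the regressors; other acceptance rules are possible.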
Data mining, Machine learning
Zhi-Hua Zhou, Ming Li, "Semisupervised Regression with Cotraining-Style Algorithms", IEEE Transactions on Knowledge & Data Engineering, vol.19, no. 11, pp. 1479-1493, November 2007, doi:10.1109/TKDE.2007.190644