Issue No. 01 - January (2008 vol. 20)
In many practical data mining applications, such as text classification, unlabeled training examples are readily available, but labeled ones are expensive to obtain. Semi-supervised learning algorithms have therefore attracted considerable interest from the data mining and machine learning communities. In recent years, graph-based semi-supervised learning has become one of the most active research areas in the semi-supervised learning community. This paper proposes a novel graph-based semi-supervised learning approach built on a linear neighborhood model, which assumes that each data point can be linearly reconstructed from its neighborhood. Our algorithm, named Linear Neighborhood Propagation (LNP), propagates the labels from the labeled points to the whole dataset through these linear neighborhoods with sufficient smoothness. A theoretical analysis of the properties of LNP is presented. Furthermore, we derive an easy way to extend LNP to out-of-sample data. Promising experimental results are presented for synthetic data and for digit and text classification tasks.
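The abstract describes LNP only in outline, so the following Python sketch (assuming NumPy) is a rough illustration of the two ideas it names: reconstructing each point linearly from its neighbors, then propagating labels over the resulting weights. The function names, the parameters `k`, `reg`, and `alpha`, the ridge-regularized solve, the clipping of negative weights, and the closed-form propagation step are all assumptions made for illustration and are not taken from the abstract.

```python
import numpy as np


def reconstruction_weights(X, k=2, reg=1e-3):
    """Row i holds weights that approximately rebuild X[i] from its k nearest neighbors."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances; a point is never its own neighbor.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]
        Z = X[nbrs] - X[i]                  # neighbors centered on X[i]
        G = Z @ Z.T                         # local Gram matrix, shape (k, k)
        G += reg * (np.trace(G) + 1e-12) * np.eye(k)  # assumed ridge term for stability
        w = np.linalg.solve(G, np.ones(k))
        w = np.clip(w, 0.0, None)           # assumption: crude nonnegativity by clipping
        W[i, nbrs] = w / w.sum()            # each row sums to one
    return W


def propagate_labels(W, y, alpha=0.99):
    """y holds a class label for labeled points and -1 for unlabeled ones."""
    y = np.asarray(y)
    n = W.shape[0]
    classes = np.unique(y[y >= 0])
    Y = np.zeros((n, classes.size))
    for col, c in enumerate(classes):
        Y[y == c, col] = 1.0
    # Fixed point of the assumed update F <- alpha * W @ F + (1 - alpha) * Y.
    F = np.linalg.solve(np.eye(n) - alpha * W, (1.0 - alpha) * Y)
    return classes[F.argmax(axis=1)]


# Toy data: two well-separated 1-D clusters with one labeled point each.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([0, -1, -1, -1, -1, 1])
pred = propagate_labels(reconstruction_weights(X, k=2), y)
print(pred)
```

In this toy case each cluster inherits the label of its single labeled point. Clipping keeps every row of the weight matrix nonnegative and summing to one, so the linear system is solvable for any `alpha` below 1. A constrained least-squares solver would be a more faithful way to enforce nonnegative weights.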
Data mining, Mining methods and algorithms, Machine learning, Graph labeling
C. Zhang and F. Wang, "Label Propagation through Linear Neighborhoods," in IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. 1, pp. 55-67, 2007.