Issue No. 11, November 2009 (vol. 31)
pp. 2000-2014
Pavan Kumar Mallapragada , Michigan State University, East Lansing
Anil K. Jain , Michigan State University, East Lansing
Yi Liu , Michigan State University, East Lansing
ABSTRACT
Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Most previous studies have focused on designing special algorithms to effectively exploit unlabeled data in conjunction with labeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by using the available unlabeled examples. We call this the semi-supervised improvement problem, to distinguish the proposed approach from existing approaches. We design a meta-semi-supervised learning algorithm that wraps around the underlying supervised algorithm and improves its performance using unlabeled data. This problem is particularly important when we need to train a supervised learning algorithm with a limited number of labeled examples and a multitude of unlabeled examples. We present a boosting framework for semi-supervised learning, termed SemiBoost. The key advantages of the proposed semi-supervised learning approach are: 1) improvement of the performance of any supervised learning algorithm given a multitude of unlabeled data, 2) efficient computation via the iterative boosting algorithm, and 3) exploitation of both the manifold and cluster assumptions in training classification models. An empirical study on 16 different data sets and a text categorization task demonstrates that the proposed framework improves the performance of several commonly used supervised learning algorithms, given a large number of unlabeled examples. We also show that the performance of the proposed algorithm, SemiBoost, is comparable to that of state-of-the-art semi-supervised learning algorithms.
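To make the wrapper idea concrete, below is a minimal Python sketch of a SemiBoost-style meta-learner built on scikit-learn. It is an illustration based only on the description above, not the authors' implementation: the RBF similarity graph, the exponential confidence weighting, the early-stopping rule, and all names (SemiBoostSketch, n_rounds, sample_frac, C) are assumptions made for this sketch.

# A simplified SemiBoost-style wrapper (illustrative sketch, not the authors' code).
import numpy as np
from sklearn.base import clone
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.tree import DecisionTreeClassifier


class SemiBoostSketch:
    """Boosts any supervised base learner with unlabeled data (labels in {-1, +1})."""

    def __init__(self, base=None, n_rounds=10, sample_frac=0.1, C=1.0):
        self.base = base if base is not None else DecisionTreeClassifier(max_depth=3)
        self.n_rounds, self.sample_frac, self.C = n_rounds, sample_frac, C
        self.learners, self.alphas = [], []

    def _H(self, X):
        # Current ensemble score: weighted vote of the base learners.
        out = np.zeros(len(X))
        for a, h in zip(self.alphas, self.learners):
            out += a * h.predict(X)
        return out

    def fit(self, Xl, yl, Xu):
        S_lu = rbf_kernel(Xu, Xl)  # similarity between unlabeled and labeled points
        S_uu = rbf_kernel(Xu, Xu)  # similarity among unlabeled points
        pos = (yl == 1).astype(float)
        neg = (yl == -1).astype(float)
        for _ in range(self.n_rounds):
            Hu = np.clip(self._H(Xu), -10, 10)  # clip to avoid exp overflow
            # p/q: similarity-weighted evidence that each unlabeled point is +1/-1,
            # with exponential weights in the spirit of boosting derivations.
            p = np.exp(-2 * Hu) * (S_lu @ pos) + \
                0.5 * self.C * np.exp(-Hu) * (S_uu @ np.exp(Hu))
            q = np.exp(2 * Hu) * (S_lu @ neg) + \
                0.5 * self.C * np.exp(Hu) * (S_uu @ np.exp(-Hu))
            z = np.where(p >= q, 1, -1)  # pseudo-labels for unlabeled points
            w = np.abs(p - q)            # pseudo-label confidence
            k = max(1, int(self.sample_frac * len(Xu)))
            top = np.argsort(-w)[:k]     # keep only the most confident examples
            h = clone(self.base).fit(np.vstack([Xl, Xu[top]]),
                                     np.concatenate([yl, z[top]]))
            hu = h.predict(Xu)
            # Combination weight: rewards agreement between h and the p/q evidence.
            num = p[hu == 1].sum() + q[hu == -1].sum()
            den = p[hu == -1].sum() + q[hu == 1].sum()
            alpha = 0.25 * np.log((num + 1e-12) / (den + 1e-12))
            if alpha <= 0:  # stop once a new learner no longer helps
                break
            self.learners.append(h)
            self.alphas.append(alpha)
        return self

    def predict(self, X):
        return np.where(self._H(X) >= 0, 1, -1)

In this sketch, predict returns labels in {-1, +1}, and boosting stops early once a newly trained learner fails to receive a positive combination weight.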
INDEX TERMS
Machine learning, semi-supervised learning, semi-supervised improvement, manifold assumption, cluster assumption, boosting.
CITATION
Pavan Kumar Mallapragada, Anil K. Jain, Yi Liu, "SemiBoost: Boosting for Semi-Supervised Learning," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 11, pp. 2000-2014, November 2009, doi:10.1109/TPAMI.2008.235