Issue No. 5 - May 2011 (vol. 23), pp. 788-799
Jun Du , The University of Western Ontario, Ontario
Charles X. Ling , The University of Western Ontario, Ontario
Zhi-Hua Zhou , Nanjing University, Nanjing
Cotraining, a paradigm of semisupervised learning, promises to effectively alleviate the shortage of labeled examples in supervised learning. The standard two-view cotraining requires the data set to be described by two views of features, and previous studies have shown that cotraining works well if the two views satisfy the sufficiency and independence assumptions. In practice, however, these two assumptions are often unknown and unverified (even when the two views are given). More commonly, most supervised data sets are described by one set of attributes (one view), so they must be split into two views before the standard two-view cotraining can be applied. In this paper, we first propose a novel approach to empirically verify the two assumptions of cotraining given two views. We then design several methods to split single-view data sets into two views so that cotraining works reliably well. Our empirical results show that, given a whole or a large labeled training set, our view verification and splitting methods are quite effective. Unfortunately, cotraining is called for precisely when the labeled training set is small, and we show that with small labeled training sets the two cotraining assumptions are difficult to verify and view splitting is unreliable. Our conclusions on cotraining's effectiveness are therefore mixed. If two views are given and known to satisfy the two assumptions, cotraining works well. Otherwise, with only a small labeled training set, verifying the assumptions or splitting a single view into two views is unreliable, so it is uncertain whether standard cotraining will work.
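The two-view cotraining loop the abstract refers to can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: each view is reduced to a single categorical feature, the per-view learner is a one-feature majority-vote stump, and confidence-based selection is simplified to pseudo-labeling the next few unlabeled examples (a real implementation, as in Blum and Mitchell [6], would rank unlabeled examples by each classifier's confidence).

```python
from collections import Counter, defaultdict

def stump_fit(xs, ys):
    # One-feature "decision stump": for each feature value seen in training,
    # predict the majority label observed with that value.
    counts = defaultdict(Counter)
    for x, label in zip(xs, ys):
        counts[x][label] += 1
    return {v: c.most_common(1)[0][0] for v, c in counts.items()}

def stump_predict(model, x, default=0):
    return model.get(x, default)

def co_train(view1, view2, labels, labeled_idx, rounds=5, per_round=2):
    # labels: full label list, but only the entries in labeled_idx are
    # "known"; the rest are treated as unlabeled and pseudo-labeled.
    y = {i: labels[i] for i in labeled_idx}
    for _ in range(rounds):
        for view in (view1, view2):
            # Train this view's classifier on all currently labeled examples
            # (given labels plus pseudo-labels contributed by either view).
            model = stump_fit([view[i] for i in y], [y[i] for i in y])
            # Simplification: instead of ranking unlabeled examples by
            # confidence, just pseudo-label the next per_round of them.
            unlabeled = [i for i in range(len(labels)) if i not in y]
            for i in unlabeled[:per_round]:
                y[i] = stump_predict(model, view[i])
    return y  # index -> given or pseudo-label
```

With two redundant, sufficient views (e.g., a toy data set in which each view's feature equals the label), a seed of one labeled example per class is enough for the loop to pseudo-label the rest correctly; when the views violate the sufficiency or independence assumptions, errors from one view's classifier propagate into the other view's training set, which is exactly the failure mode the paper studies.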
Index Terms: Semisupervised learning, cotraining, sufficiency assumption, independence assumption, view splitting, single-view.
Jun Du, Charles X. Ling, Zhi-Hua Zhou, "When Does Cotraining Work in Real Data?", IEEE Transactions on Knowledge & Data Engineering, vol.23, no. 5, pp. 788-799, May 2011, doi:10.1109/TKDE.2010.158
[1] S. Abney, "Bootstrapping," Proc. 40th Ann. Meeting of the Assoc. for Computational Linguistics, pp. 360-367, 2002.
[2] A. Asuncion and D.J. Newman, "UCI Machine Learning Repository," 2007.
[3] M.F. Balcan, A. Blum, and K. Yang, "Co-Training and Expansion: Towards Bridging Theory and Practice," Advances in Neural Information Processing Systems, vol. 17, pp. 89-96, MIT Press, 2005.
[4] M. Belkin and P. Niyogi, "Semi-Supervised Learning on Riemannian Manifolds," Machine Learning, vol. 56, pp. 209-239, 2004.
[5] A. Blum and S. Chawla, "Learning from Labeled and Unlabeled Data Using Graph Mincuts," Proc. 18th Int'l Conf. Machine Learning, pp. 19-26, 2001.
[6] A. Blum and T. Mitchell, "Combining Labeled and Unlabeled Data with Co-Training," Proc. 11th Ann. Conf. Computational Learning Theory, pp. 92-100, 1998.
[7] Semi-Supervised Learning, O. Chapelle, B. Schölkopf, and A. Zien eds., MIT Press, 2006.
[8] O. Chapelle, J. Weston, and B. Schölkopf, "Cluster Kernels for Semi-Supervised Learning," Advances in Neural Information Processing Systems, vol. 15, pp. 585-592, MIT Press, 2003.
[9] S. Dasgupta, M. Littman, and D. McAllester, "PAC Generalization Bounds for Co-Training," Advances in Neural Information Processing Systems, vol. 14, pp. 375-382, MIT Press, 2002.
[10] A. Fujino, N. Ueda, and K. Saito, "Semisupervised Learning for a Hybrid Generative/Discriminative Classifier Based on the Maximum Entropy Principle," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 3, pp. 424-437, Mar. 2008.
[11] S. Goldman and Y. Zhou, "Enhancing Supervised Learning with Unlabeled Data," Proc. 17th Int'l Conf. Machine Learning, pp. 327-334, 2000.
[12] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I.H. Witten, "The Weka Data Mining Software: An Update," SIGKDD Explorations, vol. 11, pp. 10-18, 2009.
[13] R. Hwa, M. Osborne, A. Sarkar, and M. Steedman, "Corrected Co-Training for Statistical Parsers," Proc. Int'l Conf. Machine Learning Workshop Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining (ICML '03), pp. 95-102, 2003.
[14] P. Langley, W. Iba, and K. Thompson, "An Analysis of Bayesian Classifiers," Proc. 10th Nat'l Conf. Artificial Intelligence, pp. 223-228, 1992.
[15] C.X. Ling, J. Du, and Z.H. Zhou, "When Does Co-Training Work in Real Data?," Proc. 13th Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD '09), pp. 596-603, 2009.
[16] D.J. Miller and H.S. Uyar, "A Mixture of Experts Classifier with Learning Based on Both Labelled and Unlabelled Data," Advances in Neural Information Processing Systems, vol. 9, pp. 571-577, MIT Press, 1997.
[17] K. Nigam and R. Ghani, "Analyzing the Effectiveness and Applicability of Co-Training," Proc. Ninth ACM Int'l Conf. Information and Knowledge Management, pp. 86-93, 2000.
[18] K. Nigam, A.K. Mccallum, S. Thrun, and T. Mitchell, "Text Classification from Labeled and Unlabeled Documents Using EM," Machine Learning, vol. 39, pp. 103-134, 2000.
[19] D. Pierce and C. Cardie, "Limitations of Co-Training for Natural Language Learning from Large Data Sets," Proc. 2001 Conf. Empirical Methods in Natural Language Processing, pp. 1-9, 2001.
[20] J.R. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., 1993.
[21] A. Sarkar, "Applying Co-Training Methods to Statistical Parsing," Proc. Second Ann. Meeting of the North Am. Chapter of the Assoc. for Computational Linguistics, pp. 95-102, 2001.
[22] M. Steedman, M. Osborne, A. Sarkar, S. Clark, R. Hwa, J. Hockenmaier, P. Ruhlen, S. Baker, and J. Crim, "Bootstrapping Statistical Parsers from Small Data Sets," Proc. 11th Conf. European Chapter of the Assoc. for Computational Linguistics, pp. 331-338, 2003.
[23] W. Wang and Z.H. Zhou, "Analyzing Co-Training Style Algorithms," Proc. 18th European Conf. Machine Learning, pp. 454-465, 2007.
[24] D. Zhou, B. Schölkopf, and T. Hofmann, "Semi-Supervised Learning on Directed Graphs," Advances in Neural Information Processing Systems, vol. 17, pp. 1633-1640, MIT Press, 2005.
[25] Z.H. Zhou, K.J. Chen, and H.B. Dai, "Enhancing Relevance Feedback in Image Retrieval Using Unlabeled Data," ACM Trans. Information Systems, vol. 24, pp. 219-244, 2006.
[26] Z.H. Zhou, K.J. Chen, and Y. Jiang, "Exploiting Unlabeled Data in Content-Based Image Retrieval," Proc. 15th European Conf. Machine Learning, pp. 525-536, 2004.
[27] Z.H. Zhou and M. Li, "Tri-Training: Exploiting Unlabeled Data Using Three Classifiers," IEEE Trans. Knowledge and Data Eng., vol. 17, no. 11, pp. 1529-1541, Nov. 2005.
[28] Z.-H. Zhou and M. Li, "Semi-Supervised Learning by Disagreement," Knowledge and Information Systems, vol. 24, no. 3, pp. 415-439, 2010.
[29] X. Zhu, Semi-Supervised Learning Literature Survey, Technical Report 1530, Dept. of Computer Sciences, Univ. of Wisconsin at Madison, 2006.
[30] X. Zhu, Z. Ghahramani, and J. Lafferty, "Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions," Proc. 20th Int'l Conf. Machine Learning, pp. 912-919, Aug. 2003.