Issue No.11 - Nov. (2012 vol.24)

pp: 2040-2051

Guangxia Li, Nanyang Technological University, Singapore

Kuiyu Chang, Nanyang Technological University, Singapore

Steven C.H. Hoi, Nanyang Technological University, Singapore

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2011.160

ABSTRACT

Obtaining high-quality and up-to-date labeled data can be difficult in many real-world machine learning applications. Semi-supervised learning aims to improve the performance of a classifier trained with a limited amount of labeled data by also utilizing unlabeled data. This paper demonstrates a way to improve the transductive SVM, an existing semi-supervised learning algorithm, by employing a multiview learning paradigm. Multiview learning is based on the observation that, for some problems, there may exist multiple perspectives, so-called views, of each data sample. For example, in text classification, the typical view contains a large number of raw content features such as term frequency, while a second view may contain a small number of highly informative domain-specific features. We propose a novel two-view transductive SVM that takes advantage of both the abundance of unlabeled data and their multiple representations to improve classification results. The idea is straightforward: train a classifier on each of the two views of both labeled and unlabeled data, and impose a global constraint requiring the two classifiers to assign the same class label to each labeled and unlabeled sample. We also incorporate manifold regularization, a graph-based semi-supervised learning method, into our framework. The proposed two-view transductive SVM was evaluated on both synthetic and real-life data sets. Experimental results show that our algorithm performs up to 10 percent better than a single-view learning approach, especially when the amount of labeled data is small. The other advantage of our two-view semi-supervised learning approach is its significantly improved stability, which is especially useful when dealing with noisy data in real-world applications.
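The two-view idea can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes linear classifiers, replaces the hard consensus constraint with a soft squared-disagreement penalty on unlabeled samples, and omits both the transductive loss and manifold regularization. All names (`train_two_view`, `predict`, `make_sample`), the hyperparameters, and the synthetic data are hypothetical choices made for illustration.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_two_view(labeled, unlabeled, epochs=200, lr=0.05, reg=0.01, agree=0.5):
    """Train one linear classifier per view by subgradient descent.

    labeled:   list of (x1, x2, y) with y in {-1, +1}
    unlabeled: list of (x1, x2)
    Loss (an assumption, not the paper's objective): hinge loss on labeled
    samples in each view, L2 regularization, and a squared-disagreement
    penalty between the two views' outputs on unlabeled samples.
    """
    dims = (len(labeled[0][0]), len(labeled[0][1]))
    w = [[0.0] * dims[0], [0.0] * dims[1]]
    n_lab = max(len(labeled), 1)
    n_unl = max(len(unlabeled), 1)
    for _ in range(epochs):
        # Start each gradient with the L2 regularization term.
        grads = [[reg * wi for wi in w[v]] for v in (0, 1)]
        # Hinge-loss subgradient on labeled samples, separately per view.
        for x1, x2, y in labeled:
            for v, x in enumerate((x1, x2)):
                if y * dot(w[v], x) < 1.0:
                    for j, xj in enumerate(x):
                        grads[v][j] -= y * xj / n_lab
        # Soft consensus: penalize (f1(x1) - f2(x2))^2 on unlabeled samples.
        for x1, x2 in unlabeled:
            diff = dot(w[0], x1) - dot(w[1], x2)
            for j, xj in enumerate(x1):
                grads[0][j] += agree * 2.0 * diff * xj / n_unl
            for j, xj in enumerate(x2):
                grads[1][j] -= agree * 2.0 * diff * xj / n_unl
        for v in (0, 1):
            w[v] = [wi - lr * gi for wi, gi in zip(w[v], grads[v])]
    return w


def predict(w, x1, x2):
    # Combine the two views by summing their decision values.
    return 1 if dot(w[0], x1) + dot(w[1], x2) >= 0 else -1


def make_sample(rng, y):
    # Synthetic views: view 1 has one signal and one noise feature,
    # view 2 has a single signal feature.
    view1 = [2.0 * y + rng.gauss(0, 1), rng.gauss(0, 1)]
    view2 = [2.0 * y + rng.gauss(0, 1)]
    return view1, view2


rng = random.Random(0)

labeled = []
for _ in range(10):  # few labeled samples
    y = rng.choice((-1, 1))
    x1, x2 = make_sample(rng, y)
    labeled.append((x1, x2, y))

unlabeled = []
for _ in range(200):  # many unlabeled samples (labels discarded)
    y = rng.choice((-1, 1))
    unlabeled.append(make_sample(rng, y))

test_set = []
for _ in range(200):
    y = rng.choice((-1, 1))
    x1, x2 = make_sample(rng, y)
    test_set.append((x1, x2, y))

w = train_two_view(labeled, unlabeled)
accuracy = sum(predict(w, x1, x2) == y for x1, x2, y in test_set) / len(test_set)
print("test accuracy:", accuracy)
```

The `agree` weight controls how strongly the two views are pushed toward the same output on unlabeled data; setting it to zero reduces the sketch to two independent supervised classifiers.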

INDEX TERMS

Support vector machines, Optimization, Manifolds, Training, Fasteners, Laplace equations, Supervised learning, support vector machines, Artificial intelligence, learning systems, semi-supervised learning, multiview learning

CITATION

Guangxia Li, Kuiyu Chang, Steven C.H. Hoi, "Multiview Semi-Supervised Learning with Consensus",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 24, no. 11, pp. 2040-2051, Nov. 2012, doi:10.1109/TKDE.2011.160