Issue No. 12 - December (2010 vol. 22)
pp: 1664-1678
Fuzhen Zhuang, Chinese Academy of Sciences, Beijing
Ping Luo, Hewlett-Packard Labs, Beijing
Hui Xiong, Rutgers University, Newark
Yuhong Xiong, Hewlett-Packard Labs, Beijing
Qing He, Chinese Academy of Sciences, Beijing
Zhongzhi Shi, Chinese Academy of Sciences, Beijing
ABSTRACT
Cross-domain classification studies how to adapt a learning model from one domain to another domain that shares similar data characteristics. Although a number of works address this problem, many of them focus only on learning from a single source domain to a target domain. A remaining challenge is how to apply the knowledge learned from multiple source domains to a target domain. Data from multiple source domains can be semantically related yet follow different distributions, and it is not clear how to exploit these distribution differences to boost learning performance in the target domain. To that end, we propose a consensus regularization framework for learning from multiple source domains to a target domain. In this framework, a local classifier is trained by considering both the local data available in one source domain and the prediction consensus with the classifiers learned from the other source domains. We provide a theoretical analysis as well as an empirical study of the proposed framework. Experimental results on text categorization and image classification problems show the effectiveness of this consensus regularization learning method. Finally, to handle the case where the multiple source domains are geographically distributed, we also develop a distributed version of the proposed algorithm, which avoids the need to upload all the data to a centralized location and helps to mitigate privacy concerns.
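The idea of training each local classifier on its own source data while encouraging agreement with the other classifiers on the target domain can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes logistic regression as the local model, and it uses a simple variance-style disagreement penalty on unlabeled target predictions as a stand-in for the paper's consensus measure. All names (`make_domain`, `train_consensus`, `theta`, and so on) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_domain(rng, shift, w_true, n=200):
    """Hypothetical toy domain: same labeling rule, shifted input distribution."""
    X = rng.normal(size=(n, len(w_true))) + np.asarray(shift)
    y = (X @ w_true > 0).astype(float)
    return X, y


def train_consensus(sources, X_target, theta=1.0, lr=0.1, epochs=200):
    """Fit one logistic-regression weight vector per source domain.

    Each update combines (a) the gradient of the local logistic loss and
    (b) the gradient of an assumed disagreement penalty,
    0.5 * mean((p_k - mean_p)^2), on the unlabeled target data.
    """
    d = X_target.shape[1]
    W = [np.zeros(d) for _ in sources]
    for _ in range(epochs):
        # Target-domain predictions of every local classifier, shape (m, n_t).
        # Only these predictions are shared between classifiers, not source data.
        P = np.stack([sigmoid(X_target @ w) for w in W])
        mean_p = P.mean(axis=0)
        for k, (X, y) in enumerate(sources):
            p_local = sigmoid(X @ W[k])
            grad_fit = X.T @ (p_local - y) / len(y)
            p_k = P[k]
            grad_cons = X_target.T @ ((p_k - mean_p) * p_k * (1.0 - p_k)) / len(X_target)
            W[k] = W[k] - lr * (grad_fit + theta * grad_cons)
    return W


def disagreement(W, X):
    """Mean variance of the classifiers' predicted probabilities on X."""
    P = np.stack([sigmoid(X @ w) for w in W])
    return float(P.var(axis=0).mean())


def predict(W, X):
    """Label X by averaging the local classifiers' probabilities."""
    P = np.stack([sigmoid(X @ w) for w in W])
    return (P.mean(axis=0) >= 0.5).astype(int)
```

In this sketch, each local update needs only the other classifiers' predictions on the target data, which loosely mirrors the motivation for the distributed version mentioned in the abstract. Setting `theta=0` reduces it to independently trained classifiers whose outputs are averaged.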
INDEX TERMS
Classification, multiple source domains, cross-domain learning, consensus regularization.
CITATION
Fuzhen Zhuang, Ping Luo, Hui Xiong, Yuhong Xiong, Qing He, Zhongzhi Shi, "Cross-Domain Learning from Multiple Sources: A Consensus Regularization Perspective," IEEE Transactions on Knowledge & Data Engineering, vol. 22, no. 12, pp. 1664-1678, December 2010, doi:10.1109/TKDE.2009.205