Issue No.05 - May (2010 vol.32)
pp: 770-787
Lorenzo Bruzzone , University of Trento, Trento
Mattia Marconcini , University of Trento, Trento
ABSTRACT
This paper addresses pattern classification in the framework of domain adaptation by considering methods that solve problems in which training data are assumed to be available only for a source domain that is different from (even if related to) the target domain of (unlabeled) test data. Two main novel contributions are proposed: 1) a domain adaptation support vector machine (DASVM) technique, which extends the formulation of support vector machines (SVMs) to the domain adaptation framework, and 2) a circular indirect accuracy assessment strategy for validating the learning of domain adaptation classifiers when no true labels for the target-domain instances are available. Experimental results, obtained on a series of two-dimensional toy problems and on two real data sets related to brain-computer interface and remote sensing applications, confirmed the effectiveness and the reliability of both the DASVM technique and the proposed circular validation strategy.
INDEX TERMS
Domain adaptation, transfer learning, semi-supervised learning, support vector machines, accuracy assessment, validation strategy.
CITATION
Lorenzo Bruzzone, Mattia Marconcini, "Domain Adaptation Problems: A DASVM Classification Technique and a Circular Validation Strategy," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 32, no. 5, pp. 770-787, May 2010, doi:10.1109/TPAMI.2009.57