
Bibliographic References  
Akinori Fujino, Naonori Ueda, Kazumi Saito, "Semisupervised Learning for a Hybrid Generative/Discriminative Classifier Based on the Maximum Entropy Principle," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 3, pp. 424-437, March 2008.
[1] K. Nigam, A. McCallum, S. Thrun, and T. Mitchell, “Text Classification from Labeled and Unlabeled Documents Using EM,” Machine Learning, vol. 39, pp. 103-134, 2000.
[2] Y. Grandvalet and Y. Bengio, “Semi-Supervised Learning by Entropy Minimization,” Advances in Neural Information Processing Systems 17, MIT Press, pp. 529-536, 2005.
[3] M. Szummer and T. Jaakkola, “Kernel Expansions with Unlabeled Examples,” Advances in Neural Information Processing Systems 13, MIT Press, pp. 626-632, 2001.
[4] M. Inoue and N. Ueda, “Exploitation of Unlabeled Sequences in Hidden Markov Models,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 12, pp. 1570-1581, Dec. 2003.
[5] M.R. Amini and P. Gallinari, “Semi-Supervised Logistic Regression,” Proc. 15th European Conf. Artificial Intelligence, pp. 390-394, 2002.
[6] T. Joachims, “Transductive Inference for Text Classification Using Support Vector Machines,” Proc. 16th Int'l Conf. Machine Learning, pp. 200-209, 1999.
[7] A. Blum and T. Mitchell, “Combining Labeled and Unlabeled Data with Co-Training,” Proc. 11th Ann. Conf. Computational Learning Theory, vol. 11, 1998.
[8] X. Zhu, Z. Ghahramani, and J. Lafferty, “Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions,” Proc. 20th Int'l Conf. Machine Learning, pp. 912-919, 2003.
[9] M. Seeger, “Learning with Labeled and Unlabeled Data,” technical report, Univ. of Edinburgh, 2001.
[10] A.Y. Ng and M.I. Jordan, “On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes,” Advances in Neural Information Processing Systems 14, pp. 841-848, MIT Press, 2002.
[11] S. Tong and D. Koller, “Restricted Bayes Optimal Classifiers,” Proc. 17th Nat'l Conf. Artificial Intelligence, pp. 658-664, 2000.
[12] R. Raina, Y. Shen, A.Y. Ng, and A. McCallum, “Classification with Hybrid Generative/Discriminative Models,” Advances in Neural Information Processing Systems 16, MIT Press, 2004.
[13] A.L. Berger, S.A. Della Pietra, and V.J. Della Pietra, “A Maximum Entropy Approach to Natural Language Processing,” Computational Linguistics, vol. 22, no. 1, pp. 39-71, 1996.
[14] A. Fujino, N. Ueda, and K. Saito, “A Hybrid Generative/Discriminative Approach to Semi-Supervised Classifier Design,” Proc. 20th Nat'l Conf. Artificial Intelligence, pp. 764-769, 2005.
[15] A. Fujino, N. Ueda, and K. Saito, “Semi-Supervised Learning on Hybrid Generative/Discriminative Models,” Information Technology Letters, vol. 4, pp. 161-164, 2005, in Japanese.
[16] A.P. Dempster, N.M. Laird, and D.B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,” J. Royal Statistical Soc. B, vol. 39, pp. 1-38, 1977.
[17] F.G. Cozman and I. Cohen, “Unlabeled Data Can Degrade Classification Performance of Generative Classifiers,” Proc. 15th Int'l Florida Artificial Intelligence Research Soc. Conf., pp. 327-331, 2002.
[18] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer-Verlag, 2001.
[19] K. Nigam, J. Lafferty, and A. McCallum, “Using Maximum Entropy for Text Classification,” Proc. Int'l Joint Conf. Artificial Intelligence Workshop Machine Learning for Information Filtering, pp. 61-67, 1999.
[20] S.F. Chen and R. Rosenfeld, “A Gaussian Prior for Smoothing Maximum Entropy Models,” technical report, Carnegie Mellon Univ., 1999.
[21] D.C. Liu and J. Nocedal, “On the Limited Memory BFGS Method for Large Scale Optimization,” Math. Programming B, vol. 45, no. 3, pp. 503-528, 1989.
[22] A. Fujino, N. Ueda, and K. Saito, “A Hybrid Generative/Discriminative Approach to Text Classification with Additional Information,” Information Processing and Management, vol. 43, pp. 379-392, 2007.
[23] Y. Yang and X. Liu, “A Re-Examination of Text Categorization Methods,” Proc. 22nd ACM Int'l Conf. Research and Development in Information Retrieval, pp. 42-49, 1999.
[24] G. Salton and M.J. McGill, Introduction to Modern Information Retrieval. McGraw-Hill, 1983.
[25] G. Forman, “An Extensive Empirical Study of Feature Selection Metrics for Text Classification,” J. Machine Learning Research, vol. 3, pp. 1289-1305, 2003.
[26] R. Bekkerman, R. El-Yaniv, N. Tishby, and Y. Winter, “On Feature Distributional Clustering for Text Classification,” Proc. 24th ACM Int'l Conf. Research and Development in Information Retrieval, pp. 146-153, 2001.
[27] J. Demšar, “Statistical Comparisons of Classifiers over Multiple Data Sets,” J. Machine Learning Research, vol. 7, pp. 1-30, 2006.
[28] D.J. Miller and H.S. Uyar, “A Mixture of Experts Classifier with Learning Based on Both Labelled and Unlabelled Data,” Advances in Neural Information Processing Systems 9, pp. 571-577, MIT Press, 1997.
[29] N.V. Chawla and G. Karakoulas, “Learning from Labeled and Unlabeled Data: An Empirical Study across Techniques and Domains,” J. Artificial Intelligence Research, vol. 23, pp. 331-366, 2005.
[30] I.S. Dhillon and D.S. Modha, “Concept Decompositions for Large Sparse Text Data Using Clustering,” Machine Learning, vol. 42, pp. 143-175, 2001.
[31] C.D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing. The MIT Press, 1999.
[32] F. Jelinek and R. Mercer, “Interpolated Estimation of Markov Source Parameters from Sparse Data,” Pattern Recognition in Practice, E.S. Gelsema and L.N. Kanal, eds., pp. 381-402, North-Holland Publishing, 1980.