Ira Cohen, Fabio G. Cozman, Nicu Sebe, Marcelo C. Cirelo, and Thomas S. Huang, "Semisupervised Learning of Classifiers: Theory, Algorithms, and Their Application to Human-Computer Interaction," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 12, pp. 1553-1567, Dec. 2004.
[1] B. Shahshahani and D. Landgrebe, “Effect of Unlabeled Samples in Reducing the Small Sample Size Problem and Mitigating the Hughes Phenomenon,” IEEE Trans. Geoscience and Remote Sensing, vol. 32, no. 5, pp. 1087-1095, 1994.
[2] T. Zhang and F. Oles, “A Probability Analysis on the Value of Unlabeled Data for Classification Problems,” Proc. Int'l Conf. Machine Learning (ICML), pp. 1191-1198, 2000.
[3] K. Nigam, A. McCallum, S. Thrun, and T. Mitchell, “Text Classification from Labeled and Unlabeled Documents Using EM,” Machine Learning, vol. 39, no. 2, pp. 103-134, 2000.
[4] R. Bruce, “Semi-Supervised Learning Using Prior Probabilities and EM,” Proc. Int'l Joint Conf. AI Workshop Text Learning: Beyond Supervision, 2001.
[5] S. Baluja, “Probabilistic Modelling for Face Orientation Discrimination: Learning from Labeled and Unlabeled Data,” Proc. Neural Information and Processing Systems (NIPS), pp. 854-860, 1998.
[6] R. Kohavi, “Scaling Up the Accuracy of Naive Bayes Classifiers: A Decision-Tree Hybrid,” Proc. Second Int'l Conf. Knowledge Discovery and Data Mining, pp. 202-207, 1996.
[7] I. Cohen, F.G. Cozman, and A. Bronstein, “On the Value of Unlabeled Data in Semi-Supervised Learning Based on Maximum-Likelihood Estimation,” Technical Report HPL-2002-140, Hewlett-Packard Labs, 2002.
[8] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, Calif.: Morgan Kaufmann, 1988.
[9] A. Garg, V. Pavlovic, and J. Rehg, “Boosted Learning in Dynamic Bayesian Networks for Multimodal Speaker Detection,” Proc. IEEE, vol. 91, pp. 1355-1369, Sept. 2003.
[10] N. Oliver, E. Horvitz, and A. Garg, “Hierarchical Representations for Learning and Inferring Office Activity from Multimodal Information,” Proc. Int'l Conf. Multimodal Interfaces (ICMI), 2002.
[11] N. Friedman, D. Geiger, and M. Goldszmidt, “Bayesian Network Classifiers,” Machine Learning, vol. 29, no. 2, pp. 131-163, 1997.
[12] R. Greiner and W. Zhou, “Structural Extension to Logistic Regression: Discriminative Parameter Learning of Belief Net Classifiers,” Proc. Ann. Nat'l Conf. Artificial Intelligence, pp. 167-173, 2002.
[13] P. Ekman and W. Friesen, Facial Action Coding System: Investigator's Guide. Palo Alto, Calif.: Consulting Psychologists Press, 1978.
[14] C.L. Blake and C.J. Merz, “UCI Repository of Machine Learning Databases,” Dept. of Information and Computer Sciences, Univ. of California, Irvine, 1998.
[15] L. Devroye, L. Gyorfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition. New York: Springer-Verlag, 1996.
[16] A. Corduneanu and T. Jaakkola, “Continuation Methods for Mixing Heterogeneous Sources,” Proc. Uncertainty in Artificial Intelligence (UAI), pp. 111-118, 2002.
[17] R. Chhikara and J. McKeon, “Linear Discriminant Analysis with Misallocation in Training Samples,” J. Am. Statistical Assoc., vol. 79, pp. 899-906, 1984.
[18] C. Chittineni, “Learning with Imperfectly Labeled Examples,” Pattern Recognition, vol. 12, pp. 271-281, 1981.
[19] T. Krishnan and S. Nandy, “Efficiency of Discriminant Analysis when Initial Samples Are Classified Stochastically,” Pattern Recognition, vol. 23, pp. 529-537, 1990.
[20] T. Krishnan and S. Nandy, “Efficiency of Logistic-Normal Supervision,” Pattern Recognition, vol. 23, pp. 1275-1279, 1990.
[21] S. Pal and E.A. Pal, Pattern Recognition: From Classical to Modern Approaches. World Scientific, 2002.
[22] D.B. Cooper and J.H. Freeman, “On the Asymptotic Improvement in the Outcome of Supervised Learning Provided by Additional Nonsupervised Learning,” IEEE Trans. Computers, vol. 19, no. 11, pp. 1055-1063, Nov. 1970.
[23] D.W. Hosmer, “A Comparison of Iterative Maximum Likelihood Estimates of the Parameters of a Mixture of Two Normal Distributions under Three Different Types of Sample,” Biometrics, vol. 29, pp. 761-770, Dec. 1973.
[24] T.J. O'Neill, “Normal Discrimination with Unclassified Observations,” J. Am. Statistical Assoc., vol. 73, no. 364, pp. 821-826, 1978.
[25] S. Ganesalingam and G.J. McLachlan, “The Efficiency of a Linear Discriminant Function Based on Unclassified Initial Samples,” Biometrika, vol. 65, pp. 658-662, Dec. 1978.
[26] V. Castelli, “The Relative Value of Labeled and Unlabeled Samples in Pattern Recognition,” PhD thesis, Stanford Univ., Palo Alto, Calif., 1994.
[27] J. Ratsaby and S.S. Venkatesh, “Learning from a Mixture of Labeled and Unlabeled Examples with Parametric Side Information,” Proc. Eighth Ann. Conf. Computational Learning Theory, pp. 412-417, 1995.
[28] T. Mitchell, “The Role of Unlabeled Data in Supervised Learning,” Proc. Sixth Int'l Colloquium Cognitive Science, 1999.
[29] D.J. Miller and H.S. Uyar, “A Mixture of Experts Classifier with Learning Based on Both Labelled and Unlabelled Data,” Neural Information and Processing Systems (NIPS), pp. 571-577, 1996.
[30] M. Collins and Y. Singer, “Unsupervised Models for Named Entity Classification,” Proc. Int'l Conf. Machine Learning, pp. 327-334, 2000.
[31] F. DeComite, F. Denis, R. Gilleron, and F. Letouzey, “Positive and Unlabeled Examples Help Learning,” Proc. 10th Int'l Conf. Algorithmic Learning Theory, O. Watanabe and T. Yokomori, eds., pp. 219-230, 1999.
[32] S. Goldman and Y. Zhou, “Enhancing Supervised Learning with Unlabeled Data,” Proc. Int'l Conf. Machine Learning, pp. 327-334, 2000.
[33] F.G. Cozman and I. Cohen, “Unlabeled Data Can Degrade Classification Performance of Generative Classifiers,” Proc. 15th Int'l Florida Artificial Intelligence Soc. Conf., pp. 327-331, 2002.
[34] I. Cohen, “Semisupervised Learning of Classifiers with Application to Human-Computer Interaction,” PhD thesis, Univ. of Illinois at Urbana-Champaign, 2003.
[35] F.G. Cozman, I. Cohen, and M. Cirelo, “Semi-Supervised Learning of Mixture Models,” Proc. Int'l Conf. Machine Learning (ICML), pp. 99-106, 2003.
[36] A.P. Dempster, N.M. Laird, and D.B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,” J. Royal Statistical Soc., Series B, vol. 39, no. 1, pp. 1-38, 1977.
[37] H. White, “Maximum Likelihood Estimation of Misspecified Models,” Econometrica, vol. 50, pp. 1-25, Jan. 1982.
[38] F.G. Cozman and I. Cohen, “The Effect of Modeling Errors in Semi-Supervised Learning of Mixture Models: How Unlabeled Data Can Degrade Performance of Generative Classifiers,” technical report, Univ. of Sao Paulo, http://www.poli.usp.br/p/fabio.cozman/Publications/lul.ps.gz, 2003.
[39] S.W. Ahmed and P.A. Lachenbruch, “Discriminant Analysis when Scale Contamination Is Present in the Initial Sample,” Classification and Clustering, pp. 331-353, New York: Academic Press, 1977.
[40] G.J. McLachlan, Discriminant Analysis and Statistical Pattern Recognition. New York: John Wiley and Sons, 1992.
[41] J.H. Friedman, “On Bias, Variance, 0/1-Loss, and the Curse-of-Dimensionality,” Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55-77, 1997.
[42] M. Meila, “Learning with Mixtures of Trees,” PhD thesis, Massachusetts Inst. of Technology, Cambridge, Mass., 1999.
[43] P. Spirtes, C. Glymour, and R. Scheines, Causation, Prediction, and Search, second ed. Cambridge, Mass.: MIT Press, 2000.
[44] J. Pearl, Causality: Models, Reasoning, and Inference. Cambridge, Mass.: Cambridge Univ. Press, 2000.
[45] J. Cheng, R. Greiner, J. Kelly, D.A. Bell, and W. Liu, “Learning Bayesian Networks from Data: An Information-Theory Based Approach,” Artificial Intelligence J., vol. 137, pp. 43-90, May 2002.
[46] J. Cheng and R. Greiner, “Comparing Bayesian Network Classifiers,” Proc. Uncertainty in Artificial Intelligence (UAI), pp. 101-108, 1999.
[47] T.V. Allen and R. Greiner, “Model Selection Criteria for Learning Belief Nets: An Empirical Comparison,” Proc. Int'l Conf. Machine Learning (ICML), pp. 1047-1054, 2000.
[48] N. Friedman, “The Bayesian Structural EM Algorithm,” Proc. Uncertainty in Artificial Intelligence (UAI), pp. 129-138, 1998.
[49] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, “Equation of State Calculations by Fast Computing Machines,” J. Chemical Physics, vol. 21, pp. 1087-1092, 1953.
[50] D. Madigan and J. York, “Bayesian Graphical Models for Discrete Data,” Int'l Statistical Rev., vol. 63, no. 2, pp. 215-232, 1995.
[51] B. Hajek, “Cooling Schedules for Optimal Annealing,” Math. Operations Research, vol. 13, pp. 311-329, May 1988.
[52] D. Roth, “Learning in Natural Language,” Proc. Int'l Joint Conf. Artificial Intelligence, pp. 898-904, 1999.
[53] P. Ekman, “Strong Evidence for Universals in Facial Expressions: A Reply to Russell's Mistaken Critique,” Psychological Bulletin, vol. 115, no. 2, pp. 268-287, 1994.
[54] M. Pantic and L.J.M. Rothkrantz, “Automatic Analysis of Facial Expressions: The State of the Art,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1424-1445, Dec. 2000.
[55] T. Kanade, J. Cohn, and Y. Tian, “Comprehensive Database for Facial Expression Analysis,” Proc. Automatic Face and Gesture Recognition (FG '00), pp. 46-53, 2000.
[56] I. Cohen, N. Sebe, A. Garg, and T.S. Huang, “Facial Expression Recognition from Video Sequences,” Proc. Int'l Conf. Multimedia and Expo (ICME), pp. 121-124, 2002.
[57] H. Tao and T.S. Huang, “Connected Vibrations: A Modal Analysis Approach to Non-Rigid Motion Tracking,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 735-740, 1998.
[58] L.S. Chen, “Joint Processing of Audio-Visual Information for the Recognition of Emotional Expressions in Human-Computer Interaction,” PhD thesis, Univ. of Illinois at Urbana-Champaign, 2000.
[59] M.H. Yang, D. Kriegman, and N. Ahuja, “Detecting Faces in Images: A Survey,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34-58, Jan. 2002.
[60] “MIT CBCL Face Database #1,” MIT Center for Biological and Computational Learning: http://www.ai.mit.edu/projects/cbcl, 2002.
[61] K. Bennett and A. Demiriz, “Semi-Supervised Support Vector Machines,” Proc. Neural Information and Processing Systems (NIPS), pp. 368-374, 1998.
[62] A. Blum and T. Mitchell, “Combining Labeled and Unlabeled Data with Co-Training,” Proc. 11th Ann. Conf. Computational Learning Theory, pp. 92-100, 1998.
[63] R. Ghani, “Combining Labeled and Unlabeled Data for Multiclass Text Categorization,” Proc. Int'l Conf. Machine Learning (ICML), pp. 187-194, 2002.