
David J. Miller and John Browning, "A Mixture Model and EM-Based Algorithm for Class Discovery, Robust Classification, and Outlier Rejection in Mixed Labeled/Unlabeled Data Sets," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 11, pp. 1468-1483, November 2003, doi: 10.1109/TPAMI.2003.1240120.
Index Terms: Class discovery, labeled and unlabeled data, outlier detection, sample rejection, mixture models, EM algorithm, text categorization.
Abstract—Several authors have shown that, when labeled data are scarce, improved classifiers can be built by augmenting the training set with a large set of unlabeled examples and then performing suitable learning. These works assume each unlabeled sample originates from one of the (known) classes. Here, we assume each unlabeled sample comes from either a known class or from a heretofore undiscovered class.
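The sketch below illustrates the basic semi-supervised mixture idea the abstract builds on: fit a Gaussian mixture by EM on the labeled and unlabeled samples jointly, with each labeled sample hard-assigned to the component of its known class while unlabeled samples are softly shared among all components. This is a minimal sketch under simplifying assumptions (one spherical Gaussian component per known class, at least one label per class); the function name semisup_gmm_em and its parameters are illustrative, and it is not the authors' full model, which additionally posits components for undiscovered classes and an outlier rejection mechanism.

```python
import numpy as np

def semisup_gmm_em(X_lab, y_lab, X_unl, n_classes, n_iter=50, var_floor=1e-6):
    """EM for a mixture of spherical Gaussians, one component per known
    class, fit jointly on labeled and unlabeled samples (illustrative
    sketch only; not the paper's full model)."""
    X = np.vstack([X_lab, X_unl])       # all samples, labeled rows first
    n_lab = len(X_lab)
    n, d = X.shape
    K = n_classes

    # Initialize each component from its labeled samples
    # (assumes at least one labeled example per class).
    mu = np.stack([X_lab[y_lab == k].mean(axis=0) for k in range(K)])
    var = np.full(K, X.var())
    pi = np.full(K, 1.0 / K)

    for _ in range(n_iter):
        # E-step: responsibilities under spherical Gaussians.
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)   # (n, K)
        logp = np.log(pi) - 0.5 * (sq / var + d * np.log(2 * np.pi * var))
        logp -= logp.max(axis=1, keepdims=True)   # numerical stability
        r = np.exp(logp)
        r[:n_lab] = 0.0                           # labeled samples are
        r[np.arange(n_lab), y_lab] = 1.0          # clamped to their class
        r /= r.sum(axis=1, keepdims=True)

        # M-step: closed-form weighted updates.
        nk = r.sum(axis=0)                        # effective counts
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)
        var = np.maximum((r * sq).sum(axis=0) / (d * nk), var_floor)
    return pi, mu, var

# Toy usage: two well-separated classes, few labels, many unlabeled points.
rng = np.random.default_rng(0)
X_lab = np.vstack([rng.normal(-2, 1, (5, 2)), rng.normal(2, 1, (5, 2))])
y_lab = np.array([0] * 5 + [1] * 5)
X_unl = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
pi, mu, var = semisup_gmm_em(X_lab, y_lab, X_unl, n_classes=2)
```

In this toy run, the ten labeled points only fix which component corresponds to which class; the hundred unlabeled points do most of the work of estimating the means, variances, and mixing weights. That sharpening effect is what the abstract's opening sentence refers to.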