
Haibo He, Edwardo A. Garcia, "Learning from Imbalanced Data," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1263-1284, September 2009.
[1] “Learning from Imbalanced Data Sets,” Proc. Am. Assoc. for Artificial Intelligence (AAAI) Workshop, N. Japkowicz, ed., 2000 (Technical Report WS-00-05).
[2] “Workshop Learning from Imbalanced Data Sets II,” Proc. Int'l Conf. Machine Learning, N.V. Chawla, N. Japkowicz, and A. Kolcz, eds., 2003.
[3] N.V. Chawla, N. Japkowicz, and A. Kolcz, “Editorial: Special Issue on Learning from Imbalanced Data Sets,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 1-6, 2004.
[4] H. He and X. Shen, “A Ranked Subspace Learning Method for Gene Expression Data Classification,” Proc. Int'l Conf. Artificial Intelligence, pp. 358-364, 2007.
[5] M. Kubat, R.C. Holte, and S. Matwin, “Machine Learning for the Detection of Oil Spills in Satellite Radar Images,” Machine Learning, vol. 30, nos. 2/3, pp. 195-215, 1998.
[6] R. Pearson, G. Goney, and J. Shwaber, “Imbalanced Clustering for Microarray Time-Series,” Proc. Int'l Conf. Machine Learning, Workshop Learning from Imbalanced Data Sets II, 2003.
[7] Y. Sun, M.S. Kamel, and Y. Wang, “Boosting for Learning Multiple Classes with Imbalanced Class Distribution,” Proc. Int'l Conf. Data Mining, pp. 592-602, 2006.
[8] N. Abe, B. Zadrozny, and J. Langford, “An Iterative Method for Multi-Class Cost-Sensitive Learning,” Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 3-11, 2004.
[9] K. Chen, B.L. Lu, and J. Kwok, “Efficient Classification of Multi-Label and Imbalanced Data Using Min-Max Modular Classifiers,” Proc. World Congress on Computational Intelligence—Int'l Joint Conf. Neural Networks, pp. 1770-1775, 2006.
[10] Z.H. Zhou and X.Y. Liu, “On Multi-Class Cost-Sensitive Learning,” Proc. Nat'l Conf. Artificial Intelligence, pp. 567-572, 2006.
[11] X.Y. Liu and Z.H. Zhou, “Training Cost-Sensitive Neural Networks with Methods Addressing the Class Imbalance Problem,” IEEE Trans. Knowledge and Data Eng., vol. 18, no. 1, pp. 63-77, Jan. 2006.
[12] C. Tan, D. Gilbert, and Y. Deville, “Multi-Class Protein Fold Classification Using a New Ensemble Machine Learning Approach,” Genome Informatics, vol. 14, pp. 206-217, 2003.
[13] N.V. Chawla, K.W. Bowyer, L.O. Hall, and W.P. Kegelmeyer, “SMOTE: Synthetic Minority Over-Sampling Technique,” J. Artificial Intelligence Research, vol. 16, pp. 321-357, 2002.
[14] H. Guo and H.L. Viktor, “Learning from Imbalanced Data Sets with Boosting and Data Generation: The DataBoost-IM Approach,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 30-39, 2004.
[15] K. Woods, C. Doss, K. Bowyer, J. Solka, C. Priebe, and W. Kegelmeyer, “Comparative Evaluation of Pattern Recognition Techniques for Detection of Microcalcifications in Mammography,” Int'l J. Pattern Recognition and Artificial Intelligence, vol. 7, no. 6, pp. 1417-1436, 1993.
[16] R.B. Rao, S. Krishnan, and R.S. Niculescu, “Data Mining for Improved Cardiac Care,” ACM SIGKDD Explorations Newsletter, vol. 8, no. 1, pp. 3-10, 2006.
[17] P.K. Chan, W. Fan, A.L. Prodromidis, and S.J. Stolfo, “Distributed Data Mining in Credit Card Fraud Detection,” IEEE Intelligent Systems, vol. 14, no. 6, pp. 67-74, Nov./Dec. 1999.
[18] P. Clifton, A. Damminda, and L. Vincent, “Minority Report in Fraud Detection: Classification of Skewed Data,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 50-59, 2004.
[19] P. Chan and S. Stolfo, “Toward Scalable Learning with Non-Uniform Class and Cost Distributions,” Proc. Int'l Conf. Knowledge Discovery and Data Mining, pp. 164-168, 1998.
[20] G.M. Weiss, “Mining with Rarity: A Unifying Framework,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 7-19, 2004.
[21] G.M. Weiss, “Mining Rare Cases,” Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers, pp. 765-776, Springer, 2005.
[22] G.E.A.P.A. Batista, R.C. Prati, and M.C. Monard, “A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 20-29, 2004.
[23] N. Japkowicz and S. Stephen, “The Class Imbalance Problem: A Systematic Study,” Intelligent Data Analysis, vol. 6, no. 5, pp. 429-449, 2002.
[24] G.M. Weiss and F. Provost, “Learning When Training Data Are Costly: The Effect of Class Distribution on Tree Induction,” J. Artificial Intelligence Research, vol. 19, pp. 315-354, 2003.
[25] R.C. Holte, L. Acker, and B.W. Porter, “Concept Learning and the Problem of Small Disjuncts,” Proc. Int'l Joint Conf. Artificial Intelligence, pp. 813-818, 1989.
[26] J.R. Quinlan, “Induction of Decision Trees,” Machine Learning, vol. 1, no. 1, pp. 81-106, 1986.
[27] T. Jo and N. Japkowicz, “Class Imbalances versus Small Disjuncts,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 40-49, 2004.
[28] N. Japkowicz, “Class Imbalances: Are We Focusing on the Right Issue?” Proc. Int'l Conf. Machine Learning, Workshop Learning from Imbalanced Data Sets II, 2003.
[29] R.C. Prati, G.E.A.P.A. Batista, and M.C. Monard, “Class Imbalances versus Class Overlapping: An Analysis of a Learning System Behavior,” Proc. Mexican Int'l Conf. Artificial Intelligence, pp. 312-321, 2004.
[30] S.J. Raudys and A.K. Jain, “Small Sample Size Effects in Statistical Pattern Recognition: Recommendations for Practitioners,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, no. 3, pp. 252-264, Mar. 1991.
[31] R. Caruana, “Learning from Imbalanced Data: Rank Metrics and Extra Tasks,” Proc. Am. Assoc. for Artificial Intelligence (AAAI) Conf., pp. 51-57, 2000 (AAAI Technical Report WS-00-05).
[32] W.H. Yang, D.Q. Dai, and H. Yan, “Feature Extraction and Uncorrelated Discriminant Analysis for High-Dimensional Data,” IEEE Trans. Knowledge and Data Eng., vol. 20, no. 5, pp. 601-614, May 2008.
[33] N.V. Chawla, “C4.5 and Imbalanced Data Sets: Investigating the Effect of Sampling Method, Probabilistic Estimate, and Decision Tree Structure,” Proc. Int'l Conf. Machine Learning, Workshop Learning from Imbalanced Data Sets II, 2003.
[34] T.M. Mitchell, Machine Learning. McGraw Hill, 1997.
[35] G.M. Weiss and F. Provost, “The Effect of Class Distribution on Classifier Learning: An Empirical Study,” Technical Report ML-TR-43, Dept. of Computer Science, Rutgers Univ., 2001.
[36] J. Laurikkala, “Improving Identification of Difficult Small Classes by Balancing Class Distribution,” Proc. Conf. AI in Medicine in Europe: Artificial Intelligence Medicine, pp. 63-66, 2001.
[37] A. Estabrooks, T. Jo, and N. Japkowicz, “A Multiple Resampling Method for Learning from Imbalanced Data Sets,” Computational Intelligence, vol. 20, pp. 18-36, 2004.
[38] D. Mease, A.J. Wyner, and A. Buja, “Boosted Classification Trees and Class Probability/Quantile Estimation,” J. Machine Learning Research, vol. 8, pp. 409-439, 2007.
[39] C. Drummond and R.C. Holte, “C4.5, Class Imbalance, and Cost Sensitivity: Why Under-Sampling Beats Over-Sampling,” Proc. Int'l Conf. Machine Learning, Workshop Learning from Imbalanced Data Sets II, 2003.
[40] X.Y. Liu, J. Wu, and Z.H. Zhou, “Exploratory Under-Sampling for Class Imbalance Learning,” Proc. Int'l Conf. Data Mining, pp. 965-969, 2006.
[41] J. Zhang and I. Mani, “KNN Approach to Unbalanced Data Distributions: A Case Study Involving Information Extraction,” Proc. Int'l Conf. Machine Learning (ICML '03), Workshop Learning from Imbalanced Data Sets, 2003.
[42] M. Kubat and S. Matwin, “Addressing the Curse of Imbalanced Training Sets: One-Sided Selection,” Proc. Int'l Conf. Machine Learning, pp. 179-186, 1997.
[43] B.X. Wang and N. Japkowicz, “Imbalanced Data Set Learning with Synthetic Samples,” Proc. IRIS Machine Learning Workshop, 2004.
[44] H. Han, W.Y. Wang, and B.H. Mao, “Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning,” Proc. Int'l Conf. Intelligent Computing, pp. 878-887, 2005.
[45] H. He, Y. Bai, E.A. Garcia, and S. Li, “ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning,” Proc. Int'l Joint Conf. Neural Networks, pp. 1322-1328, 2008.
[46] I. Tomek, “Two Modifications of CNN,” IEEE Trans. Systems, Man, and Cybernetics, vol. 6, no. 11, pp. 769-772, Nov. 1976.
[47] N.V. Chawla, A. Lazarevic, L.O. Hall, and K.W. Bowyer, “SMOTEBoost: Improving Prediction of the Minority Class in Boosting,” Proc. Seventh European Conf. Principles and Practice of Knowledge Discovery in Databases, pp. 107-119, 2003.
[48] H. Guo and H.L. Viktor, “Boosting with Data Generation: Improving the Classification of Hard to Learn Examples,” Proc. Int'l Conf. Innovations in Applied Artificial Intelligence, pp. 1082-1091, 2004.
[49] C. Elkan, “The Foundations of Cost-Sensitive Learning,” Proc. Int'l Joint Conf. Artificial Intelligence, pp. 973-978, 2001.
[50] K.M. Ting, “An Instance-Weighting Method to Induce Cost-Sensitive Trees,” IEEE Trans. Knowledge and Data Eng., vol. 14, no. 3, pp. 659-665, May/June 2002.
[51] M.A. Maloof, “Learning When Data Sets Are Imbalanced and When Costs Are Unequal and Unknown,” Proc. Int'l Conf. Machine Learning, Workshop Learning from Imbalanced Data Sets II, 2003.
[52] K. McCarthy, B. Zabar, and G.M. Weiss, “Does Cost-Sensitive Learning Beat Sampling for Classifying Rare Classes?” Proc. Int'l Workshop Utility-Based Data Mining, pp. 69-77, 2005.
[53] X.Y. Liu and Z.H. Zhou, “The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study,” Proc. Int'l Conf. Data Mining, pp. 970-974, 2006.
[54] P. Domingos, “MetaCost: A General Method for Making Classifiers Cost-Sensitive,” Proc. Int'l Conf. Knowledge Discovery and Data Mining, pp. 155-164, 1999.
[55] B. Zadrozny, J. Langford, and N. Abe, “Cost-Sensitive Learning by Cost-Proportionate Example Weighting,” Proc. Int'l Conf. Data Mining, pp. 435-442, 2003.
[56] Y. Freund and R.E. Schapire, “Experiments with a New Boosting Algorithm,” Proc. Int'l Conf. Machine Learning, pp. 148-156, 1996.
[57] Y. Freund and R.E. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting,” J. Computer and System Sciences, vol. 55, no. 1, pp. 119-139, 1997.
[58] Y. Sun, M.S. Kamel, A.K.C. Wong, and Y. Wang, “Cost-Sensitive Boosting for Classification of Imbalanced Data,” Pattern Recognition, vol. 40, no. 12, pp. 3358-3378, 2007.
[59] W. Fan, S.J. Stolfo, J. Zhang, and P.K. Chan, “AdaCost: Misclassification Cost-Sensitive Boosting,” Proc. Int'l Conf. Machine Learning, pp. 97-105, 1999.
[60] K.M. Ting, “A Comparative Study of Cost-Sensitive Boosting Algorithms,” Proc. Int'l Conf. Machine Learning, pp. 983-990, 2000.
[61] M. Maloof, P. Langley, S. Sage, and T. Binford, “Learning to Detect Rooftops in Aerial Images,” Proc. Image Understanding Workshop, pp. 835-845, 1997.
[62] L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees. Chapman & Hall/CRC Press, 1984.
[63] C. Drummond and R.C. Holte, “Exploiting the Cost (In)Sensitivity of Decision Tree Splitting Criteria,” Proc. Int'l Conf. Machine Learning, pp. 239-246, 2000.
[64] S. Haykin, Neural Networks: A Comprehensive Foundation, second ed. Prentice-Hall, 1999.
[65] M.Z. Kukar and I. Kononenko, “Cost-Sensitive Learning with Neural Networks,” Proc. European Conf. Artificial Intelligence, pp. 445-449, 1998.
[66] P. Domingos and M. Pazzani, “Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier,” Proc. Int'l Conf. Machine Learning, pp. 105-112, 1996.
[67] G.I. Webb and M.J. Pazzani, “Adjusted Probability Naive Bayesian Induction,” Proc. Australian Joint Conf. Artificial Intelligence, pp. 285-295, 1998.
[68] R. Kohavi and D. Wolpert, “Bias Plus Variance Decomposition for Zero-One Loss Functions,” Proc. Int'l Conf. Machine Learning, 1996.
[69] J. Gama, “Iterative Bayes,” Theoretical Computer Science, vol. 292, no. 2, pp. 417-430, 2003.
[70] G. Fumera and F. Roli, “Support Vector Machines with Embedded Reject Option,” Proc. Int'l Workshop Pattern Recognition with Support Vector Machines, pp. 68-82, 2002.
[71] J.C. Platt, “Fast Training of Support Vector Machines Using Sequential Minimal Optimization,” Advances in Kernel Methods: Support Vector Learning, pp. 185-208, MIT Press, 1999.
[72] J.T. Kwok, “Moderating the Outputs of Support Vector Machine Classifiers,” IEEE Trans. Neural Networks, vol. 10, no. 5, pp. 1018-1031, Sept. 1999.
[73] V.N. Vapnik, The Nature of Statistical Learning Theory. Springer, 1995.
[74] B. Raskutti and A. Kowalczyk, “Extreme Re-Balancing for SVMs: A Case Study,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 60-69, 2004.
[75] R. Akbani, S. Kwek, and N. Japkowicz, “Applying Support Vector Machines to Imbalanced Data Sets,” Lecture Notes in Computer Science, vol. 3201, pp. 39-50, 2004.
[76] G. Wu and E. Chang, “Class-Boundary Alignment for Imbalanced Data Set Learning,” Proc. Int'l Conf. Data Mining (ICDM '03), Workshop Learning from Imbalanced Data Sets II, 2003.
[77] F. Vilarino, P. Spyridonos, P. Radeva, and J. Vitria, “Experiments with SVM and Stratified Sampling with an Imbalanced Problem: Detection of Intestinal Contractions,” Lecture Notes in Computer Science, vol. 3687, pp. 783-791, 2005.
[78] P. Kang and S. Cho, “EUS SVMs: Ensemble of Under-Sampled SVMs for Data Imbalance Problems,” Lecture Notes in Computer Science, vol. 4232, pp. 837-846, 2006.
[79] Y. Liu, A. An, and X. Huang, “Boosting Prediction Accuracy on Imbalanced Data Sets with SVM Ensembles,” Lecture Notes in Artificial Intelligence, vol. 3918, pp. 107-118, 2006.
[80] B.X. Wang and N. Japkowicz, “Boosting Support Vector Machines for Imbalanced Data Sets,” Lecture Notes in Artificial Intelligence, vol. 4994, pp. 38-47, 2008.
[81] Y. Tang and Y.Q. Zhang, “Granular SVM with Repetitive Undersampling for Highly Imbalanced Protein Homology Prediction,” Proc. Int'l Conf. Granular Computing, pp. 457-460, 2006.
[82] Y.C. Tang, B. Jin, and Y.Q. Zhang, “Granular Support Vector Machines with Association Rules Mining for Protein Homology Prediction,” Artificial Intelligence in Medicine, special issue on computational intelligence techniques in bioinformatics, vol. 35, nos. 1/2, pp. 121-134, 2005.
[83] Y.C. Tang, B. Jin, Y.Q. Zhang, H. Fang, and B. Wang, “Granular Support Vector Machines Using Linear Decision Hyperplanes for Fast Medical Binary Classification,” Proc. Int'l Conf. Fuzzy Systems, pp. 138-142, 2005.
[84] Y.C. Tang, Y.Q. Zhang, Z. Huang, X.T. Hu, and Y. Zhao, “Granular SVM-RFE Feature Selection Algorithm for Reliable Cancer-Related Gene Subsets Extraction on Microarray Gene Expression Data,” Proc. IEEE Symp. Bioinformatics and Bioeng., pp. 290-293, 2005.
[85] X. Hong, S. Chen, and C.J. Harris, “A Kernel-Based Two-Class Classifier for Imbalanced Data Sets,” IEEE Trans. Neural Networks, vol. 18, no. 1, pp. 28-41, Jan. 2007.
[86] G. Wu and E.Y. Chang, “Aligning Boundary in Kernel Space for Learning Imbalanced Data Set,” Proc. Int'l Conf. Data Mining, pp. 265-272, 2004.
[87] G. Wu and E.Y. Chang, “KBA: Kernel Boundary Alignment Considering Imbalanced Data Distribution,” IEEE Trans. Knowledge and Data Eng., vol. 17, no. 6, pp. 786-795, June 2005.
[88] G. Wu and E.Y. Chang, “Adaptive Feature-Space Conformal Transformation for Imbalanced-Data Learning,” Proc. Int'l Conf. Machine Learning, pp. 816-823, 2003.
[89] Y.H. Liu and Y.T. Chen, “Face Recognition Using Total Margin-Based Adaptive Fuzzy Support Vector Machines,” IEEE Trans. Neural Networks, vol. 18, no. 1, pp. 178-192, Jan. 2007.
[90] Y.H. Liu and Y.T. Chen, “Total Margin-Based Adaptive Fuzzy Support Vector Machines for Multiview Face Recognition,” Proc. Int'l Conf. Systems, Man and Cybernetics, pp. 1704-1711, 2005.
[91] G. Fung and O.L. Mangasarian, “Multicategory Proximal Support Vector Machine Classifiers,” Machine Learning, vol. 59, nos. 1/2, pp. 77-97, 2005.
[92] J. Yuan, J. Li, and B. Zhang, “Learning Concepts from Large Scale Imbalanced Data Sets Using Support Cluster Machines,” Proc. Int'l Conf. Multimedia, pp. 441-450, 2006.
[93] A.K. Qin and P.N. Suganthan, “Kernel Neural Gas Algorithms with Application to Cluster Analysis,” Proc. Int'l Conf. Pattern Recognition, 2004.
[94] X.P. Yu and X.G. Yu, “Novel Text Classification Based on K-Nearest Neighbor,” Proc. Int'l Conf. Machine Learning and Cybernetics, pp. 3425-3430, 2007.
[95] P. Li, K.L. Chan, and W. Fang, “Hybrid Kernel Machine Ensemble for Imbalanced Data Sets,” Proc. Int'l Conf. Pattern Recognition, pp. 1108-1111, 2006.
[96] A. Tashk, R. Bayesteh, and K. Faez, “Boosted Bayesian Kernel Classifier Method for Face Detection,” Proc. Int'l Conf. Natural Computation, pp. 533-537, 2007.
[97] N. Abe, “Invited Talk: Sampling Approaches to Learning from Imbalanced Data Sets: Active Learning, Cost Sensitive Learning and Beyond,” Proc. Int'l Conf. Machine Learning, Workshop Learning from Imbalanced Data Sets II, 2003.
[98] S. Ertekin, J. Huang, L. Bottou, and L. Giles, “Learning on the Border: Active Learning in Imbalanced Data Classification,” Proc. ACM Conf. Information and Knowledge Management, pp. 127-136, 2007.
[99] S. Ertekin, J. Huang, and C.L. Giles, “Active Learning for Class Imbalance Problem,” Proc. Int'l SIGIR Conf. Research and Development in Information Retrieval, pp. 823-824, 2007.
[100] F. Provost, “Machine Learning from Imbalanced Data Sets 101,” Proc. Learning from Imbalanced Data Sets: Papers from the Am. Assoc. for Artificial Intelligence Workshop, 2000 (Technical Report WS-00-05).
[101] A. Bordes, S. Ertekin, J. Weston, and L. Bottou, “Fast Kernel Classifiers with Online and Active Learning,” J. Machine Learning Research, vol. 6, pp. 1579-1619, 2005.
[102] J. Zhu and E. Hovy, “Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem,” Proc. Joint Conf. Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 783-790, 2007.
[103] J. Doucette and M.I. Heywood, “GP Classification under Imbalanced Data Sets: Active Sub-Sampling AUC Approximation,” Lecture Notes in Computer Science, vol. 4971, pp. 266-277, 2008.
[104] B. Scholkopf, J.C. Platt, J. Shawe-Taylor, A.J. Smola, and R.C. Williamson, “Estimating the Support of a High-Dimensional Distribution,” Neural Computation, vol. 13, pp. 1443-1471, 2001.
[105] L.M. Manevitz and M. Yousef, “One-Class SVMs for Document Classification,” J. Machine Learning Research, vol. 2, pp. 139-154, 2001.
[106] L. Zhuang and H. Dai, “Parameter Estimation of One-Class SVM on Imbalance Text Classification,” Lecture Notes in Artificial Intelligence, vol. 4013, pp. 538-549, 2006.
[107] H.J. Lee and S. Cho, “The Novelty Detection Approach for Different Degrees of Class Imbalance,” Lecture Notes in Computer Science, vol. 4233, pp. 21-30, 2006.
[108] L. Zhuang and H. Dai, “Parameter Optimization of Kernel-Based One-Class Classifier on Imbalance Text Learning,” Lecture Notes in Artificial Intelligence, vol. 4099, pp. 434-443, 2006.
[109] N. Japkowicz, “Supervised versus Unsupervised Binary-Learning by Feedforward Neural Networks,” Machine Learning, vol. 42, pp. 97-122, 2001.
[110] L. Manevitz and M. Yousef, “One-Class Document Classification via Neural Networks,” Neurocomputing, vol. 70, pp. 1466-1481, 2007.
[111] N. Japkowicz, “Learning from Imbalanced Data Sets: A Comparison of Various Strategies,” Proc. Am. Assoc. for Artificial Intelligence (AAAI) Workshop Learning from Imbalanced Data Sets, pp. 10-15, 2000 (Technical Report WS-00-05).
[112] N. Japkowicz, C. Myers, and M. Gluck, “A Novelty Detection Approach to Classification,” Proc. Int'l Joint Conf. Artificial Intelligence, pp. 518-523, 1995.
[113] C.T. Su and Y.H. Hsiao, “An Evaluation of the Robustness of MTS for Imbalanced Data,” IEEE Trans. Knowledge and Data Eng., vol. 19, no. 10, pp. 1321-1332, Oct. 2007.
[114] G. Taguchi, S. Chowdhury, and Y. Wu, The Mahalanobis-Taguchi System. McGraw-Hill, 2001.
[115] G. Taguchi and R. Jugulum, The Mahalanobis-Taguchi Strategy. John Wiley & Sons, 2002.
[116] M.V. Joshi, V. Kumar, and R.C. Agarwal, “Evaluating Boosting Algorithms to Classify Rare Classes: Comparison and Improvements,” Proc. Int'l Conf. Data Mining, pp. 257-264, 2001.
[117] F.J. Provost and T. Fawcett, “Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions,” Proc. Int'l Conf. Knowledge Discovery and Data Mining, pp. 43-48, 1997.
[118] F.J. Provost, T. Fawcett, and R. Kohavi, “The Case against Accuracy Estimation for Comparing Induction Algorithms,” Proc. Int'l Conf. Machine Learning, pp. 445-453, 1998.
[119] T. Fawcett, “ROC Graphs: Notes and Practical Considerations for Data Mining Researchers,” Technical Report HPL-2003-4, HP Labs, 2003.
[120] T. Fawcett, “An Introduction to ROC Analysis,” Pattern Recognition Letters, vol. 27, no. 8, pp. 861-874, 2006.
[121] F. Provost and P. Domingos, “Well-Trained Pets: Improving Probability Estimation Trees,” CeDER Working Paper IS-00-04, Stern School of Business, New York Univ., 2000.
[122] T. Fawcett, “Using Rule Sets to Maximize ROC Performance,” Proc. Int'l Conf. Data Mining, pp. 131-138, 2001.
[123] J. Davis and M. Goadrich, “The Relationship between Precision-Recall and ROC Curves,” Proc. Int'l Conf. Machine Learning, pp. 233-240, 2006.
[124] R. Bunescu, R. Ge, R. Kate, E. Marcotte, R. Mooney, A. Ramani, and Y. Wong, “Comparative Experiments on Learning Information Extractors for Proteins and Their Interactions,” Artificial Intelligence in Medicine, vol. 33, pp. 139-155, 2005.
[125] J. Davis, E. Burnside, I. Dutra, D. Page, R. Ramakrishnan, V.S. Costa, and J. Shavlik, “View Learning for Statistical Relational Learning: With an Application to Mammography,” Proc. Int'l Joint Conf. Artificial Intelligence, pp. 677-683, 2005.
[126] P. Singla and P. Domingos, “Discriminative Training of Markov Logic Networks,” Proc. Nat'l Conf. Artificial Intelligence, pp. 868-873, 2005.
[127] T. Landgrebe, P. Paclik, R. Duin, and A.P. Bradley, “Precision-Recall Operating Characteristic (P-ROC) Curves in Imprecise Environments,” Proc. Int'l Conf. Pattern Recognition, pp. 123-127, 2006.
[128] R.C. Holte and C. Drummond, “Cost Curves: An Improved Method for Visualizing Classifier Performance,” Machine Learning, vol. 65, no. 1, pp. 95-130, 2006.
[129] R.C. Holte and C. Drummond, “Cost-Sensitive Classifier Evaluation,” Proc. Int'l Workshop Utility-Based Data Mining, pp. 3-9, 2005.
[130] R.C. Holte and C. Drummond, “Explicitly Representing Expected Cost: An Alternative to ROC Representation,” Proc. Int'l Conf. Knowledge Discovery and Data Mining, pp. 198-207, 2000.
[131] D.J. Hand and R.J. Till, “A Simple Generalization of the Area under the ROC Curve to Multiple Class Classification Problems,” Machine Learning, vol. 45, no. 2, pp. 171-186, 2001.
[132] UC Irvine Machine Learning Repository, http://archive.ics.uci.edu/ml/, 2009.
[133] NIST Scientific and Technical Databases, http://nist.gov/srd/online.htm, 2009.
[134] H. He and S. Chen, “IMORL: Incremental Multiple-Object Recognition and Localization,” IEEE Trans. Neural Networks, vol. 19, no. 10, pp. 1727-1738, Oct. 2008.
[135] X. Zhu, “Semi-Supervised Learning Literature Survey,” Technical Report TR 1530, Univ. of Wisconsin-Madison, 2007.
[136] A. Blum and T. Mitchell, “Combining Labeled and Unlabeled Data with Co-Training,” Proc. Workshop Computational Learning Theory, pp. 92-100, 1998.
[137] T.M. Mitchell, “The Role of Unlabeled Data in Supervised Learning,” Proc. Int'l Colloquium on Cognitive Science, 1999.
[138] C. Rosenberg, M. Hebert, and H. Schneiderman, “Semi-Supervised Self-Training of Object Detection Models,” Proc. IEEE Workshops Application of Computer Vision, pp. 29-36, 2005.
[139] M. Wang, X.S. Hua, L.R. Dai, and Y. Song, “Enhanced Semi-Supervised Learning for Automatic Video Annotation,” Proc. Int'l Conf. Multimedia and Expo, pp. 1485-1488, 2006.
[140] K.P. Bennett and A. Demiriz, “Semi-Supervised Support Vector Machines,” Proc. Conf. Neural Information Processing Systems, pp. 368-374, 1998.
[141] V. Sindhwani and S.S. Keerthi, “Large Scale Semi-Supervised Linear SVMs,” Proc. Int'l SIGIR Conf. Research and Development in Information Retrieval, pp. 477-484, 2006.
[142] A. Blum and S. Chawla, “Learning from Labeled and Unlabeled Data Using Graph Mincuts,” Proc. Int'l Conf. Machine Learning, pp. 19-26, 2001.
[143] D. Zhou, B. Scholkopf, and T. Hofmann, “Semi-Supervised Learning on Directed Graphs,” Proc. Conf. Neural Information Processing Systems, pp. 1633-1640, 2004.
[144] A. Fujino, N. Ueda, and K. Saito, “A Hybrid Generative/Discriminative Approach to Semi-Supervised Classifier Design,” Proc. Nat'l Conf. Artificial Intelligence, pp. 764-769, 2005.
[145] D.J. Miller and H.S. Uyar, “A Mixture of Experts Classifier with Learning Based on Both Labeled and Unlabelled Data,” Proc. Ann. Conf. Neural Information Processing Systems, pp. 571-577, 1996.