Ensembles of alpha-Trees for Imbalanced Classification Problems
PrePrint
ISSN: 1041-4347
Yubin Park, The University of Texas at Austin, Austin
Joydeep Ghosh, The University of Texas at Austin, Austin
This paper introduces two kinds of decision tree ensembles for imbalanced classification problems, extensively utilizing properties of $\alpha$-divergence. First, a novel splitting criterion based on $\alpha$-divergence is shown to generalize several well-known splitting criteria, such as those used in C4.5 and CART. When the $\alpha$-divergence splitting criterion is applied to imbalanced data, one can obtain decision trees that tend to be less correlated ($\alpha$-diversification) by varying the value of $\alpha$. This increased diversity in an ensemble of such trees improves AUROC values across a range of minority class priors. The second ensemble uses the same $\alpha$-trees as base classifiers, but applies a lift-aware stopping criterion during tree growth. The resultant ensemble produces a set of interpretable rules that provide higher lift values for a given coverage, a property that is highly desirable in applications such as direct marketing. Experimental results across many class-imbalanced datasets, including the BRFSS and MIMIC datasets from the medical community and several sets from UCI and KEEL, are provided to highlight the effectiveness of the proposed ensembles over a wide range of data distributions and degrees of class imbalance.
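To make the generalization concrete, the limiting behavior the abstract mentions can be illustrated with a Tsallis-style generalized impurity. This is a sketch under an assumption, not the paper's exact criterion (which is derived from $\alpha$-divergence between distributions), but this family exhibits the same endpoints: $\alpha \to 1$ recovers Shannon entropy (the basis of C4.5's information gain) and $\alpha = 2$ recovers the Gini index (used in CART).

```python
import math

def alpha_impurity(probs, alpha):
    """Generalized (Tsallis-style) node impurity, parameterized by alpha.

    Illustrative sketch only; the paper's alpha-divergence splitting
    criterion is formulated differently. Here:
      alpha -> 1  gives Shannon entropy (C4.5-style measure)
      alpha == 2  gives the Gini index (CART-style measure)
    Varying alpha yields a continuum of impurity measures, which is
    the mechanism behind alpha-diversification of the base trees.
    """
    if abs(alpha - 1.0) < 1e-9:
        # Limit alpha -> 1: Shannon entropy in nats
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** alpha for p in probs)) / (alpha - 1.0)

# A 50/50 node under alpha = 2 gives the familiar Gini value 0.5;
# a pure node has zero impurity for any alpha.
print(alpha_impurity([0.5, 0.5], 2.0))  # 0.5
print(alpha_impurity([1.0, 0.0], 2.0))  # 0.0
```

Sampling several values of $\alpha$ and growing one tree per value is one way the ensemble could obtain decorrelated base classifiers, since each tree prefers different splits.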
Index Terms:
Computing Methodologies, Pattern Recognition, General, Information Technology and Systems, Database Management, Database Applications, Data mining, Clustering, classification, association rules
Citation:
Yubin Park, Joydeep Ghosh, "Ensembles of alpha-Trees for Imbalanced Classification Problems," IEEE Transactions on Knowledge and Data Engineering, 31 Dec. 2012. IEEE Computer Society Digital Library. IEEE Computer Society, <http://doi.ieeecomputersociety.org/10.1109/TKDE.2012.255>