First IEEE International Conference on Data Mining (ICDM'01)
Comparisons of Classification Methods for Screening Potential Compounds
San Jose, California
November 29-December 02
ISBN: 0-7695-1119-8
We compare a number of data mining and statistical methods on the drug design problem of modeling molecular structure-activity relationships. These relationships can be used to identify active compounds, based on their chemical structures, from a large inventory of chemical compounds. The data set for this application has a highly skewed class distribution, in which only 2% of the compounds are considered active. We apply a number of classification methods to this extremely imbalanced data set and propose using different performance measures to evaluate them. We report our findings on the characteristics of the performance measures, the effect of using pruning techniques in this application, and a comparison of local learning methods with global techniques. We also investigate whether reducing the imbalance in the training data by up-sampling or down-sampling improves predictive performance.
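As a rough illustration of the imbalance issue described in the abstract, the following Python sketch shows one common way to down-sample or up-sample a training set with 2% actives, and why plain accuracy can be misleading on such data. It is not the authors' procedure: the function names (`resample`, `precision_recall`), the toy label vector, and the choice of precision and recall as measures are all assumptions made for this example, since the abstract does not name the measures or sampling details used in the paper.

```python
import random


def resample(labels, mode, seed=0):
    """Return indices of a class-balanced training set (hypothetical helper).

    mode="down": keep all actives, randomly keep an equal number of inactives.
    mode="up":   keep all inactives, draw actives with replacement until
                 they match the number of inactives.
    Assumes label 1 = active (minority) and 0 = inactive (majority).
    """
    rng = random.Random(seed)
    active = [i for i, y in enumerate(labels) if y == 1]
    inactive = [i for i, y in enumerate(labels) if y == 0]
    if mode == "down":
        inactive = rng.sample(inactive, len(active))
    elif mode == "up":
        active = [rng.choice(active) for _ in range(len(inactive))]
    else:
        raise ValueError("mode must be 'down' or 'up'")
    return active + inactive


def precision_recall(y_true, y_pred):
    """Precision and recall for the active class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data: 1000 compounds, 2% active, matching the ratio in the abstract.
labels = [1] * 20 + [0] * 980

down_idx = resample(labels, "down")  # 20 actives + 20 inactives
up_idx = resample(labels, "up")      # 980 sampled actives + 980 inactives

# A trivial classifier that predicts "inactive" for every compound.
all_inactive = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, all_inactive)) / len(labels)
precision, recall = precision_recall(labels, all_inactive)
```

On this toy data the trivial classifier scores high accuracy while finding no active compounds at all, which is the kind of behavior that motivates evaluating with measures other than accuracy.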
Citation:
Aijun An, Yuanyuan Wang, "Comparisons of Classification Methods for Screening Potential Compounds," icdm, pp.11, First IEEE International Conference on Data Mining (ICDM'01), 2001