Fourth IEEE International Conference on Data Mining (ICDM'04)
Decision Tree Evolution Using Limited Number of Labeled Data Items from Drifting Data Streams
Brighton, United Kingdom
November 1-4, 2004
ISBN: 0-7695-2142-8
Wei Fan, IBM T. J. Watson Research, Hawthorne, NY
Yi-an Huang, Georgia Institute of Technology, Atlanta, GA
Philip S. Yu, IBM T. J. Watson Research, Hawthorne, NY
Most previously proposed methods for mining data streams make the unrealistic assumption that a labelled data stream is readily available and can be mined at any time. In most real-world problems, however, labelled data streams are rarely available immediately. For this reason, models are reconstructed only when labelled data become available periodically. This passive stream mining model has several drawbacks. We propose a new concept of demand-driven active data mining. In active mining, the loss of the model is either continuously guessed without using any true class labels or estimated, whenever necessary, from a small number of instances whose actual class labels are verified by paying an affordable cost. When the estimated loss exceeds a tolerable threshold, the model evolves by using a small number of instances with verified true class labels. Previous work on active mining concentrates on error guessing and estimation. In this paper, we discuss several approaches to decision tree evolution.
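The demand-driven check described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes 0-1 loss, uniform random sampling, and a hypothetical `label_oracle` callable that stands in for paying to verify a true class label. The names `estimate_loss` and `needs_evolution` are likewise illustrative.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple


def estimate_loss(
    model: Callable[[Any], Any],
    instances: Sequence[Any],
    label_oracle: Callable[[Any], Any],  # hypothetical: returns the verified label at a cost
    sample_size: int,
    rng: random.Random,
) -> Tuple[float, List[Tuple[Any, Any]]]:
    """Estimate 0-1 loss from a small random sample with verified labels.

    Returns the estimated loss and the labelled sample, so the same
    verified instances can be reused if the model has to evolve.
    """
    sample = rng.sample(list(instances), min(sample_size, len(instances)))
    if not sample:
        return 0.0, []
    # Only the sampled instances incur the labelling cost.
    labelled = [(x, label_oracle(x)) for x in sample]
    errors = sum(1 for x, y in labelled if model(x) != y)
    return errors / len(labelled), labelled


def needs_evolution(estimated_loss: float, threshold: float) -> bool:
    """Signal model evolution when the estimated loss exceeds the tolerable threshold."""
    return estimated_loss > threshold
```

In this sketch the labelled sample returned by `estimate_loss` would be handed to whatever tree-evolution procedure is in use when `needs_evolution` returns `True`; the evolution step itself is left out because the abstract does not specify it.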
Citation:
Wei Fan, Yi-an Huang, Philip S. Yu, "Decision Tree Evolution Using Limited Number of Labeled Data Items from Drifting Data Streams," icdm, pp.379-382, Fourth IEEE International Conference on Data Mining (ICDM'04), 2004