Issue No. 06 - November/December (2008 vol. 12)
ISSN: 1089-7801
pp: 37-49
Jing Gao , University of Illinois, Urbana-Champaign
Philip S. Yu , University of Illinois, Chicago
Wei Fan , IBM T.J. Watson Research Center
Bolin Ding , University of Illinois, Urbana-Champaign
Jiawei Han , University of Illinois, Urbana-Champaign
ABSTRACT
Classification is an important data analysis tool that uses a model built from historical data to predict class labels for new observations. More and more applications feature data streams rather than finite stored data sets, and such streams pose a challenge for traditional classification algorithms. Concept drifts and skewed distributions, two common properties of data stream applications, make learning in streams difficult. The authors aim to develop a new approach to classifying skewed data streams that uses an ensemble of models to match the distribution over under-samples of negatives and repeated samples of positives.
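The sampling idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the negatives are under-sampled or which base learner is used, so the disjoint split of negatives, the toy centroid-based learner, and all function names below (`build_balanced_samples`, `train_centroid_model`, `ensemble_proba`) are assumptions made for illustration only.

```python
import random
from statistics import mean


def build_balanced_samples(positives, negatives, k, seed=0):
    """Return k training sets as (positives, negative_subset) pairs.

    Every positive example is repeated in each set, while the negatives
    are shuffled and split into k disjoint under-samples.
    The disjoint split is an assumption of this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    chunks = [shuffled[i::k] for i in range(k)]
    return [(list(positives), chunk) for chunk in chunks]


def train_centroid_model(pos, neg):
    """Toy base learner (hypothetical): score a point by its relative
    distance to the positive and negative class centroids."""
    dim = len(pos[0])
    pos_center = [mean(p[d] for p in pos) for d in range(dim)]
    neg_center = [mean(n[d] for n in neg) for d in range(dim)]

    def predict_proba(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, pos_center)) ** 0.5
        d_neg = sum((a - b) ** 2 for a, b in zip(x, neg_center)) ** 0.5
        if d_pos + d_neg == 0:
            return 0.5
        # Closer to the positive centroid gives a score nearer to 1.
        return d_neg / (d_pos + d_neg)

    return predict_proba


def ensemble_proba(models, x):
    """Average the positive-class scores of all ensemble members."""
    return mean(m(x) for m in models)


# Illustrative, made-up data: 3 positives against 30 negatives.
positives = [(5.0, 5.0), (6.0, 5.5), (5.5, 6.0)]
negatives = [(i * 0.1, (i % 7) * 0.1) for i in range(30)]

samples = build_balanced_samples(positives, negatives, k=3)
models = [train_centroid_model(pos, neg) for pos, neg in samples]
```

Each ensemble member sees a far less skewed training set than the raw data, and averaging the members' scores uses all the negatives without letting them dominate any single model. The sketch assumes `k` does not exceed the number of negatives, since an empty negative subset would leave the toy learner without a centroid.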
INDEX TERMS
data stream, classification algorithms, concept drifts, data mining, model averaging, skewed distributions
CITATION
Jing Gao, Philip S. Yu, Wei Fan, Bolin Ding, Jiawei Han, "Classifying Data Streams with Skewed Class Distributions and Concept Drifts", IEEE Internet Computing, vol. 12, no. 6, pp. 37-49, November/December 2008, doi:10.1109/MIC.2008.119