2013 IEEE 13th International Conference on Data Mining Workshops (2012)
Brussels, Belgium Belgium
Dec. 10, 2012 to Dec. 10, 2012
There is currently a surge of interest in adaptive learning algorithms for applications ranging from ozone level peak predictions, learning stock market indicators, and detecting smart phone usage patterns. In such scenarios, the detection of change (or drift) in the concept being learned is important to ensure that correct, timely and relevant models are constructed. In addition, such data is often imbalanced and, to further complicate the issue, we are frequently interested in learning the minority class. It follows that ignoring these two aspects during learning may lead to unreliable, or even incorrect, models being built. In this research we discuss the interplay between concept drift detection and imbalanced data sets in order to ensure reliable results. We introduce a novel algorithm that, rather than considering a single performance evaluation measure such as accuracy for change detection, considers all the components of a confusion matrix and employs the cosine similarity coefficient. We evaluate our algorithm against a real world mobile phone database, as well as benchmarking datasets, and we compare it with two other state-of-the-art methods. The results show that our approach is particularly sensitive to concept drifts occurring in imbalanced data sets. Our evaluation indicates that our algorithm is able to detect concept drift reliably. Further, our method is shown to perform very well compared to the other techniques, especially when the drift occurs in the minority class of a class imbalance problem.
Vectors, Change detection algorithms, Data models, Reliability, Accuracy, Training, Prediction algorithms, reliable model building, Concept drift detection, class imbalance, classification
Daniel K. Antwi, Herna L. Viktor, Nathalie Japkowicz, "The PerfSim Algorithm for Concept Drift Detection in Imbalanced Data", 2013 IEEE 13th International Conference on Data Mining Workshops, vol. 00, no. , pp. 619-628, 2012, doi:10.1109/ICDMW.2012.122