Aug. 14, 2009 to Aug. 16, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/FSKD.2009.553
Data stream clustering is an important task in data stream mining. In this paper, we propose SDStream, a new method for density-based clustering of data streams over sliding windows. SDStream adopts the CluStream clustering framework. In the online component, potential core-micro-cluster and outlier micro-cluster structures are introduced to maintain the potential clusters and outliers. They are stored in main memory in the form of Exponential Histograms of Cluster Feature (EHCF) and are kept up to date by maintaining the EHCFs. Outdated micro-clusters that need to be deleted are identified by the value of t in the Temporal Cluster Feature (TCF). In the offline component, the final clusters of arbitrary shape are generated by applying the DBSCAN algorithm to all the potential core-micro-clusters maintained online. Experimental results show that SDStream, which can generate clusters of arbitrary shape, achieves much higher clustering quality than CluStream, which generates spherical clusters.
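The role of t in the TCF can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a TCF holds a point count, a linear sum, a squared sum, and the timestamp t of its most recent record, and that a micro-cluster is outdated once t falls outside the last `window` time units. The class name `TCF`, its field names, and the helper `drop_outdated` are hypothetical, and the EHCF bucket structure is omitted entirely.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import List


@dataclass
class TCF:
    """Hypothetical temporal cluster feature (field layout is an assumption)."""
    dim: int
    n: int = 0                                      # number of absorbed points
    ls: List[float] = field(default_factory=list)   # per-dimension linear sum
    ss: List[float] = field(default_factory=list)   # per-dimension squared sum
    t: int = -1                                     # timestamp of the latest point

    def __post_init__(self):
        # Start with zero vectors when no sums are supplied.
        if not self.ls:
            self.ls = [0.0] * self.dim
        if not self.ss:
            self.ss = [0.0] * self.dim

    def absorb(self, point, timestamp):
        """Add one point to the summary and refresh the latest timestamp."""
        self.n += 1
        for i, x in enumerate(point):
            self.ls[i] += x
            self.ss[i] += x * x
        self.t = max(self.t, timestamp)

    def center(self):
        """Mean of the absorbed points."""
        return [s / self.n for s in self.ls]

    def radius(self):
        """Root-mean-square deviation of the absorbed points from the center."""
        var = sum(
            self.ss[i] / self.n - (self.ls[i] / self.n) ** 2
            for i in range(self.dim)
        )
        return sqrt(max(var, 0.0))


def drop_outdated(micro_clusters, now, window):
    """Keep micro-clusters whose latest point lies in the last `window` time units
    (assumed expiry rule)."""
    return [mc for mc in micro_clusters if mc.t > now - window]


old = TCF(dim=2)
old.absorb([0.0, 0.0], timestamp=1)
old.absorb([2.0, 0.0], timestamp=2)

recent = TCF(dim=2)
recent.absorb([5.0, 5.0], timestamp=9)

kept = drop_outdated([old, recent], now=10, window=5)
```

With `now=10` and `window=5`, only `recent` survives, because the latest timestamp of `old` is 2. In an offline step, the centers of the surviving potential core-micro-clusters could then be passed to any DBSCAN implementation to form the final clusters.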
data stream, density-based clustering, sliding windows
Jiadong Ren, Ruiqing Ma, "Density-Based Data Streams Clustering over Sliding Windows", FSKD, 2009, Fuzzy Systems and Knowledge Discovery, Fourth International Conference on 2009, pp. 248-252, doi:10.1109/FSKD.2009.553