Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06)
HClustream: A Novel Approach for Clustering Evolving Heterogeneous Data Stream
Hong Kong, China
December 18-December 22
ISBN: 0-7695-2702-7
Continuously arriving, evolving data streams have become common in many fields, such as sensor networks, web click streams, and Internet traffic flows. Clustering is one of the most important mining tasks on such streams and has attracted extensive research from both the machine learning and data mining communities. Many stream clustering methods have been proposed, and they have proven efficient on specific problems. However, most of these methods handle only continuous attributes, and few address the clustering of heterogeneous data. In this paper, we propose a novel approach, based on the CluStream framework, for clustering data streams with heterogeneous features. Each micro-cluster is represented by the centroid of its continuous attributes and the histogram of its discrete attributes, and the k-prototype clustering algorithm is used to create both the micro-clusters and the macro-clusters. Experimental results on both synthetic and real data sets show the efficiency of the approach.
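To make the micro-cluster representation concrete, the following is a minimal Python sketch of a summary that keeps a centroid for the continuous attributes and a histogram for each discrete attribute, together with a k-prototype-style mixed distance. It is an illustration only, not the paper's implementation: the names `MicroCluster`, `nearest`, and `gamma`, and the specific discrete dissimilarity (one minus the relative frequency of the point's value in the histogram), are assumptions that the abstract does not specify.

```python
from collections import Counter


class MicroCluster:
    """Hypothetical summary: centroid (continuous) + histograms (discrete)."""

    def __init__(self, n_cont, n_disc):
        self.n = 0                                      # points absorbed so far
        self.linear_sum = [0.0] * n_cont                # per-attribute sums
        self.hist = [Counter() for _ in range(n_disc)]  # per-attribute value counts

    def add(self, cont, disc):
        """Absorb one point incrementally, as a stream summary must."""
        self.n += 1
        for i, x in enumerate(cont):
            self.linear_sum[i] += x
        for j, v in enumerate(disc):
            self.hist[j][v] += 1

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def distance(self, cont, disc, gamma=1.0):
        """Mixed distance: squared Euclidean part plus weighted discrete part.

        Assumption: the discrete part is the sum over attributes of
        (1 - relative frequency of the point's value); gamma is an
        assumed weight balancing the two parts.
        """
        numeric = sum((x - c) ** 2 for x, c in zip(cont, self.centroid()))
        discrete = sum(1.0 - h[v] / self.n for h, v in zip(self.hist, disc))
        return numeric + gamma * discrete


def nearest(clusters, cont, disc, gamma=1.0):
    """Index of the micro-cluster closest to the point under the mixed distance."""
    return min(range(len(clusters)),
               key=lambda i: clusters[i].distance(cont, disc, gamma))


# Usage: two micro-clusters over (2 continuous, 1 discrete) attributes.
a = MicroCluster(2, 1)
a.add([0.0, 0.0], ["tcp"])
a.add([2.0, 2.0], ["tcp"])
b = MicroCluster(2, 1)
b.add([10.0, 10.0], ["udp"])
closest = nearest([a, b], [1.0, 1.0], ["udp"])
```

Because both the linear sums and the histograms are additive, two such summaries could be merged by adding their fields, which is what makes this kind of representation suitable for a single pass over a stream.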
Citation:
Chunyu Yang, Jie Zhou, "HClustream: A Novel Approach for Clustering Evolving Heterogeneous Data Stream," Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06), pp. 682-688, 2006