16th International Conference on Scientific and Statistical Database Management (SSDBM'04)
A Fast Algorithm for Subspace Clustering by Pattern Similarity
Santorini Island, Greece
June 21-23, 2004
ISBN: 0-7695-2146-0
Haixun Wang, IBM T.J. Watson Research Center
Fang Chu, Univ. of California, Los Angeles
Wei Fan, IBM T.J. Watson Research Center
Philip S. Yu, IBM T.J. Watson Research Center
Jian Pei, SUNY Buffalo
Unlike traditional clustering methods, which group objects with similar values on a set of dimensions, clustering by pattern similarity finds objects that exhibit a coherent pattern of rise and fall in subspaces. Pattern-based clustering extends the concept of traditional clustering and benefits a wide range of applications, including large-scale scientific data analysis, target marketing, and web usage analysis. However, state-of-the-art pattern-based clustering methods (e.g., the pCluster algorithm) can only handle datasets of thousands of records, which makes them inappropriate for many real-life applications. Furthermore, besides their huge volume, many datasets are also characterized by their sequentiality; for instance, customer purchase records and network event logs are usually modeled as data sequences. Hence, it becomes important to enable pattern-based clustering methods i) to handle large datasets, and ii) to discover pattern similarity embedded in data sequences. In this paper, we present a novel algorithm that offers this capability. Experimental results on both real-life and synthetic datasets demonstrate its effectiveness and efficiency.
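To make the notion of a "coherent pattern of rise and fall" concrete, the following minimal Python sketch checks whether a set of objects moves in step on a chosen subset of dimensions. It is an illustration only, not the algorithm presented in this paper. It assumes the pairwise coherence measure used in the earlier pCluster work, where a 2x2 submatrix has score |(x_a - x_b) - (y_a - y_b)| and a group is coherent when every such score is at most a threshold. The function names, the threshold name `delta`, and the toy data are hypothetical.

```python
from itertools import combinations

def pscore(x, y, a, b):
    """Score of the 2x2 submatrix formed by objects x, y and columns a, b.

    A score of 0 means x and y rise or fall by exactly the same amount
    between columns a and b, even if their absolute values differ.
    """
    return abs((x[a] - x[b]) - (y[a] - y[b]))

def is_pattern_cluster(rows, cols, delta):
    """Brute-force check (assumed definition): every pair of objects must
    stay within delta on every pair of the selected columns."""
    for x, y in combinations(rows, 2):
        for a, b in combinations(cols, 2):
            if pscore(x, y, a, b) > delta:
                return False
    return True

# Toy data: the three objects have very different values, but on columns
# 0..2 each is a shifted copy of the others. Column 3 breaks the pattern.
data = [
    [1, 5, 3, 9],
    [11, 15, 13, 2],
    [21, 25, 23, 30],
]

coherent = is_pattern_cluster(data, cols=[0, 1, 2], delta=0)
not_coherent = is_pattern_cluster(data, cols=[0, 1, 2, 3], delta=0)
```

A distance-based method would not group these objects, since their values are far apart, yet they share the same pattern in the subspace formed by the first three columns. The exhaustive pairwise check shown here scales poorly with the number of objects and dimensions, which gives some intuition for the scalability problem the abstract describes.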
Citation:
Haixun Wang, Fang Chu, Wei Fan, Philip S. Yu, Jian Pei, "A Fast Algorithm for Subspace Clustering by Pattern Similarity," 16th International Conference on Scientific and Statistical Database Management (SSDBM'04), p. 51, 2004