loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Sixth IEEE International Conference on Data Mining (ICDM'06)
Frequent Closed Itemset Mining Using Prefix Graphs with an Efficient Flow-Based Pruning Strategy
Hong Kong
December 18-December 22
ISBN: 0-7695-2701-9
H.D.K. Moonesinghe, Michigan State University, USA
Samah Fodeh, Michigan State University, USA
Pang-Ning Tan, Michigan State University, USA
This paper presents PGMiner, a novel graph-based algorithm for mining frequent closed itemsets. Our approach consists of constructing a prefix graph structure and decomposing the database to variable length bit vectors, which are assigned to nodes of the graph. The main advantage of this representation is that the bit vectors at each node are relatively shorter than those produced by existing vertical mining methods. This facilitates fast frequency counting of itemsets via intersection operations. We also devise several internode and intra-node pruning strategies to substantially reduce the combinatorial search space. Unlike other existing approaches, we do not need to store in memory the entire set of closed itemsets that have been mined so far in order to check whether a candidate itemset is closed. This dramatically reduces the memory usage of our algorithm, especially for low support thresholds. Our experiments using synthetic and real-world data sets show that PGMiner outperforms existing mining algorithms by as much as an order of magnitude and is scalable to very large databases.
Citation:
H.D.K. Moonesinghe, Samah Fodeh, Pang-Ning Tan, "Frequent Closed Itemset Mining Using Prefix Graphs with an Efficient Flow-Based Pruning Strategy," icdm, pp.426-435, Sixth IEEE International Conference on Data Mining (ICDM'06), 2006
Usage of this product signifies your acceptance of the Terms of Use.