loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
18th International Conference on Database and Expert Systems Applications (DEXA 2007)
A Binary Decision Diagram to discover low threshold support frequent itemsets
Regensburg, Germany
September 03-September 07
ISBN: 0-7695-2932-1
Wassim Ayadi, Faculty of Sciences of Tunisia, Tunisia
Khedija Arour, INSAT, Tunisia
Discovering association rules that identify relationships among sets of items is an important problem in data mining. Finding frequent itemsets is computationally the most expensive step in association rule discovery and therefore it has grasped significant research focus [1]. Discovery of frequently occurring subsets of items, called itemsets, is the core of many data mining methods. Most of the previous studies adopt Apriorilike algorithms, whom iteratively generate candidate itemsets and check their occurrence frequencies in the database. These approaches suffer from serious costs of repeated passes over the analyzed database. In this paper, we propose a new BDD-based (Binary Decision Diagram) data structure called TREESUPBDD. The TREESUPBDD extends the idea claimed by the authors of FP-TREE [9] and ITL-Tree [5] structures, aiming to improve storage compression and to allow frequent pattern mining without an ?explicit? candidate itemset generation step. To address this problem, we propose a novel method, called TREESUPBDDMINE, for reducing database activity of frequent itemset discovery algorithms. The idea of TREESUPBDD-MINE consists in using a Binary Decision Diagram and a tree for representing both database and frequent itemsets. The proposed method requires one scan over the source database : to create the associated tree and BDD and check discovered itemset supports. The originality of our work stands on the fact that the proposed algorithm extracts the frequent itemsets directly from the TREESUPBDD. Carried out experiments showed very encouraged results. Its performance improvements have been shown in a series of our experiments. We extend the binary decision diagram structure to store transaction groups and propose a new method to discover frequents itemsets. To study the trade-offs in the new representation of transactions in binary decision diagram, we compare the performance of our algorithm with the fastest Apriori [2] implementation algorithm and the latest extension of FP-Growth [15]. We have tested all the algorithms using different benchmark datasets. The performance study shows that the new algorithm significantly reduces the processing time for mining frequent itemsets from dense datasets that contain relatively long patterns and for low threshold. We discuss the performance results in detail and also the strengths and limitations of our algorithm.
Index Terms:
data mining, association rules, frequent item sets, binary decision diagram
Citation:
Wassim Ayadi, Khedija Arour, "A Binary Decision Diagram to discover low threshold support frequent itemsets," dexa, pp.509-513, 18th International Conference on Database and Expert Systems Applications (DEXA 2007), 2007
Usage of this product signifies your acceptance of the Terms of Use.