2016 International Conference on Big Data and Smart Computing (BigComp)
Hong Kong, China
Jan. 18, 2016 to Jan. 20, 2016
Serin Lee, Department of Computer Science, Sungshin Women's University, Seoul, Republic of Korea
Jong Soo Park, School of Information Technology, Sungshin Women's University, Seoul, Republic of Korea
Top-k high utility itemset mining is the discovery of the k itemsets with the highest utility in a transactional database, where k is specified by the user and the utility of each item is taken into account. Because existing top-k high utility itemset mining algorithms are based on the pattern-growth method, they search for patterns in two steps. As a result, they unavoidably generate many candidates and require an additional database scan to calculate exact utilities. In this paper, we propose a new algorithm, TKUL-Miner, to mine top-k high utility itemsets efficiently. It uses a new utility-list structure that stores, at each node of the search tree, the information needed to mine the itemsets. The proposed algorithm includes a strategy that orders the search over a specific region so as to raise the border minimum utility threshold rapidly. In addition, two further strategies that calculate smaller overestimated utilities are proposed to prune unpromising itemsets effectively. Experimental results on various datasets show that TKUL-Miner outperforms other recent algorithms in both runtime and memory efficiency.
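The general idea of utility-list-based top-k mining can be illustrated with a minimal sketch. It is not TKUL-Miner itself: the abstract does not specify the paper's utility-list layout, its search-order strategy, or its two tighter overestimation strategies, so none of them are reproduced here. The sketch assumes a generic utility list that maps each transaction id to a pair (utility of the itemset, remaining utility of the items that follow it in a fixed order), a min-heap of size k whose smallest entry plays the role of the border minimum utility threshold, and the standard "itemset utility plus remaining utility" upper bound for pruning. All names (`build_item_lists`, `join`, `top_k_hui`) and the toy data are hypothetical.

```python
import heapq


def build_item_lists(transactions, profit):
    """Build one utility list per item: tid -> (item utility, remaining utility)."""
    # Transaction-weighted utility (TWU) is used only to fix a total item order.
    twu = {}
    for t in transactions:
        tu = sum(q * profit[i] for i, q in t.items())
        for i in t:
            twu[i] = twu.get(i, 0) + tu
    order = sorted(twu, key=lambda i: (twu[i], i))
    rank = {i: r for r, i in enumerate(order)}
    lists = {i: {} for i in order}
    for tid, t in enumerate(transactions):
        items = sorted(t, key=rank.__getitem__)
        remaining = sum(t[i] * profit[i] for i in items)
        for i in items:
            u = t[i] * profit[i]
            remaining -= u  # utility of the items after i in this transaction
            lists[i][tid] = (u, remaining)
    return order, lists


def join(prefix, px, py):
    """Utility list of P+x+y from the lists of P+x and P+y (prefix is P's list or None)."""
    out = {}
    for tid, (iu_x, _) in px.items():
        if tid in py:
            iu_y, ru_y = py[tid]
            iu_p = prefix[tid][0] if prefix is not None else 0
            # The utility of P is counted in both operands, so subtract it once.
            out[tid] = (iu_x + iu_y - iu_p, ru_y)
    return out


def top_k_hui(transactions, profit, k):
    """Return up to k (utility, itemset) pairs, highest utility first."""
    order, lists = build_item_lists(transactions, profit)
    heap = []  # min-heap; heap[0][0] is the border threshold once k entries exist

    def border():
        return heap[0][0] if len(heap) == k else 0

    def offer(itemset, util):
        if len(heap) < k:
            heapq.heappush(heap, (util, itemset))
        elif util > heap[0][0]:
            heapq.heapreplace(heap, (util, itemset))

    def search(prefix_list, extensions):
        for idx, (itemset, ul) in enumerate(extensions):
            if not ul:  # the itemset occurs in no transaction
                continue
            iutil = sum(e[0] for e in ul.values())
            rutil = sum(e[1] for e in ul.values())
            offer(itemset, iutil)
            # iutil + rutil bounds the utility of every extension of this itemset.
            if iutil + rutil > border():
                children = [
                    (itemset + (other[-1],), join(prefix_list, ul, other_ul))
                    for other, other_ul in extensions[idx + 1:]
                ]
                search(ul, children)

    search(None, [((i,), lists[i]) for i in order])
    return sorted(heap, reverse=True)


# Toy data (hypothetical): per-transaction item quantities and per-unit profits.
profit = {"a": 5, "b": 2, "c": 1, "d": 2}
transactions = [
    {"a": 1, "b": 2, "c": 3},
    {"b": 4, "c": 1, "d": 2},
    {"a": 2, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]

if __name__ == "__main__":
    for utility, itemset in top_k_hui(transactions, profit, 3):
        print(itemset, utility)
```

Because exact utilities come directly from the joined lists, this style of search needs no candidate-verification scan over the database. The faster the heap fills with high-utility itemsets, the earlier the bound check starts cutting branches, which is presumably why the order in which the search space is explored matters.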
Itemsets, Algorithm design and analysis, Data mining, Memory management, Data structures, Computer science
S. Lee and J. S. Park, "Top-k high utility itemset mining based on utility-list structures," 2016 International Conference on Big Data and Smart Computing (BigComp), Hong Kong, China, 2016, pp. 101-108.