Issue No. 01 - Jan. (2016 vol. 28)
ISSN: 1041-4347
pp: 54-67
Vincent S. Tseng, Department of Computer Science, National Chiao Tung University, Hsinchu City, Taiwan
Cheng-Wei Wu, Department of Computer Science, National Chiao Tung University, Hsinchu City, Taiwan
Philippe Fournier-Viger, Department of Computer Science, University of Moncton, Moncton, NB, Canada
Philip S. Yu, Department of Computer Science, University of Illinois at Chicago, Chicago, IL
ABSTRACT
High utility itemset (HUI) mining is an emerging topic in data mining. It refers to discovering all itemsets whose utility meets a user-specified minimum utility threshold, min_util. However, setting min_util appropriately is difficult for users, and finding a suitable value by trial and error is tedious. If min_util is set too low, too many HUIs will be generated, which may make the mining process very inefficient. On the other hand, if min_util is set too high, it is likely that no HUIs will be found. In this paper, we address these issues by proposing a new framework for top-k high utility itemset mining, where k is the desired number of HUIs to be mined. Two efficient algorithms, TKU (mining Top-K Utility itemsets) and TKO (mining Top-K utility itemsets in One phase), are proposed for mining such itemsets without the need to set min_util. We provide a structural comparison of the two algorithms and discuss their advantages and limitations. Empirical evaluations on both real and synthetic datasets show that the performance of the proposed algorithms is close to that of the optimal case of state-of-the-art utility mining algorithms.
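To make the problem statement concrete, the following is a minimal brute-force sketch of top-k high utility itemset mining. It is not TKU or TKO, whose details the abstract does not give. It enumerates every itemset and keeps the k best in a min-heap, so the smallest utility in the heap acts as an internal threshold that rises during the search, and the user supplies k instead of min_util. The toy transactions, the unit profits, and the names utility and top_k_hui are assumptions made up for illustration. The utility of an itemset is taken here as the sum of quantity times unit profit over all transactions that contain every item of the itemset.

```python
import heapq
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "d": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def top_k_hui(transactions, unit_profit, k):
    """Brute-force top-k search (exponential in the number of items)."""
    items = sorted({item for t in transactions for item in t})
    heap = []  # min-heap of (utility, itemset); heap[0] is the current k-th best
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, unit_profit)
            if u == 0:
                continue  # itemset never occurs in the database
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:
                # Beats the internal threshold: replace the weakest entry.
                heapq.heapreplace(heap, (u, itemset))
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


for u, itemset in top_k_hui(transactions, unit_profit, k=3):
    print(itemset, u)
```

On this toy data the itemset (b, c, d) has utility 11 while its subset (b, c) has utility 8, so utility does not shrink as itemsets grow. This is one reason the support-based pruning used in frequent itemset mining does not carry over directly, and why a practical algorithm needs something better than exhaustive enumeration.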
INDEX TERMS
Itemsets, Data mining, Algorithm design and analysis, Computer science, Adaptation models, Memory management, Utility mining, high utility itemset, high utility itemset mining, frequent itemset mining, top-k pattern mining, top-k high utility itemset mining
CITATION
Vincent S. Tseng, Cheng-Wei Wu, Philippe Fournier-Viger, Philip S. Yu, "Efficient Algorithms for Mining Top-K High Utility Itemsets", IEEE Transactions on Knowledge & Data Engineering, vol. 28, no. 1, pp. 54-67, Jan. 2016, doi:10.1109/TKDE.2015.2458860