Issue No. 05 - May (2016 vol. 28)
ISSN: 1041-4347
pp: 1245-1257
Junqiang Liu, School of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou, China
Ke Wang, School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
Benjamin C.M. Fung, School of Information Studies, McGill University, Montreal, QC, Canada
ABSTRACT
Utility mining is a new development in data mining technology. Among utility mining problems, utility mining with the itemset share framework is a hard one, as no anti-monotonicity property holds for the interestingness measure. Prior works on this problem all employ a two-phase, candidate generation approach, with one exception that is, however, inefficient and does not scale to large databases. The two-phase approach suffers from a scalability issue due to the huge number of candidates. This paper proposes a novel algorithm that finds high utility patterns in a single phase without generating candidates. The novelties lie in a high utility pattern growth approach, a lookahead strategy, and a linear data structure. Concretely, our pattern growth approach searches a reverse set enumeration tree and prunes the search space by utility upper bounding. We also look ahead to identify high utility patterns without enumeration, using a closure property and a singleton property. Our linear data structure enables us to compute a tight bound for powerful pruning and to directly identify high utility patterns in an efficient and scalable way, which targets the root cause of the problems with prior algorithms. Extensive experiments on sparse and dense, synthetic and real-world data suggest that our algorithm is 1 to 3 orders of magnitude more efficient and more scalable than the state-of-the-art algorithms.
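To make the itemset share framework concrete, the following is a minimal sketch of how the utility of an itemset can be computed, and of why anti-monotonicity fails. It is not the paper's algorithm: it uses brute-force enumeration instead of the pattern growth, lookahead, and linear data structure described above. The toy data and the names `UNIT_PROFIT`, `TRANSACTIONS`, `utility`, and `high_utility_patterns` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: unit profit per item, and purchase
# quantities per item in each transaction.
UNIT_PROFIT = {"a": 1, "b": 5, "c": 2}
TRANSACTIONS = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]


def utility(itemset, transactions=TRANSACTIONS, profit=UNIT_PROFIT):
    """Utility of an itemset: for every transaction that contains all
    of its items, add quantity * unit profit of those items."""
    items = set(itemset)
    total = 0
    for t in transactions:
        if items.issubset(t):  # the transaction must contain the whole itemset
            total += sum(t[i] * profit[i] for i in items)
    return total


def high_utility_patterns(min_util, transactions=TRANSACTIONS, profit=UNIT_PROFIT):
    """Brute-force baseline (an assumption for illustration only):
    enumerate every non-empty itemset and keep those whose utility
    reaches min_util. Exponential in the number of items."""
    all_items = sorted(profit)
    result = {}
    for k in range(1, len(all_items) + 1):
        for combo in combinations(all_items, k):
            u = utility(combo, transactions, profit)
            if u >= min_util:
                result[combo] = u
    return result


if __name__ == "__main__":
    # {a} has low utility, yet its superset {a, b} has high utility,
    # so pruning {a} on its own utility would lose a valid pattern.
    print(utility({"a"}), utility({"a", "b"}))
    print(high_utility_patterns(12))
```

On this toy data, the itemset {a} falls below a threshold of 12 while its superset {a, b} exceeds it, which is why a search cannot prune on utility itself and needs an upper bound instead.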
INDEX TERMS
Data mining, Itemsets, Data structures, Scalability, Knowledge engineering, Data engineering, high utility patterns, utility mining, pattern mining, frequent patterns
CITATION
Junqiang Liu, Ke Wang, Benjamin C.M. Fung, "Mining High Utility Patterns in One Phase without Generating Candidates", IEEE Transactions on Knowledge & Data Engineering, vol. 28, no. 5, pp. 1245-1257, May 2016, doi:10.1109/TKDE.2015.2510012