2002 IEEE International Conference on Data Mining, 2002. Proceedings. (2002)
Maebashi City, Japan
Dec. 9, 2002 to Dec. 12, 2002
Yabo Xu, Chinese University of Hong Kong
Jeffrey Xu Yu, Chinese University of Hong Kong
Guimei Liu, Hong Kong University of Science and Technology
Hongjun Lu, Hong Kong University of Science and Technology
In this paper, we propose a new framework for mining frequent patterns from large transactional databases. The core of the framework is a novel coded prefix-path tree with two representations: a memory-based prefix-path tree and a disk-based prefix-path tree. The disk-based prefix-path tree has a simple data structure, yet it is rich in information and small in size. The memory-based prefix-path tree is simple and compact. On top of the memory-based prefix-path tree, we propose a new depth-first frequent pattern discovery algorithm, called PP-Mine, that significantly outperforms FP-growth. The memory-based prefix-path tree can be stored on disk as a disk-based prefix-path tree with the assistance of the new coding scheme. We present efficient loading algorithms to load the minimal required disk-based prefix-path tree into main memory. Our technique is to push constraints into the loading process, an approach that has not yet been well studied.
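To make the idea of mining over a prefix tree concrete, the following is a minimal sketch of a generic in-memory prefix tree with a depth-first pattern search. It is not PP-Mine and does not reproduce the coding scheme or the disk-based representation, none of which the abstract specifies. The names `Node`, `build_prefix_tree`, and `mine_frequent_patterns`, the ordering of items by descending support, and the way support is counted by walking the tree are all assumptions made for illustration.

```python
from collections import Counter


class Node:
    """One node of a hypothetical prefix tree: an item plus a transaction count."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_prefix_tree(transactions, min_support):
    """Insert each transaction as a path of frequent items in a fixed global order."""
    support = Counter(item for t in transactions for item in set(t))
    frequent = [i for i, c in support.items() if c >= min_support]
    # Assumed ordering: descending support, ties broken by item name.
    order = sorted(frequent, key=lambda i: (-support[i], i))
    rank = {item: r for r, item in enumerate(order)}
    root = Node()
    for t in transactions:
        path = sorted((i for i in set(t) if i in rank), key=rank.__getitem__)
        node = root
        for item in path:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root, order, rank


def pattern_support(node, pattern, idx, rank):
    """Count transactions containing `pattern` (items listed in rank order)."""
    total = 0
    target = pattern[idx]
    for child in node.children.values():
        if child.item == target:
            if idx == len(pattern) - 1:
                total += child.count
            else:
                total += pattern_support(child, pattern, idx + 1, rank)
        elif rank[child.item] < rank[target]:
            # The target may still appear deeper on this path.
            total += pattern_support(child, pattern, idx, rank)
        # Otherwise the path has already passed the target's rank: skip it.
    return total


def mine_frequent_patterns(transactions, min_support):
    """Depth-first enumeration of frequent patterns with support-based pruning."""
    root, order, rank = build_prefix_tree(transactions, min_support)
    results = {}

    def extend(prefix, start):
        for k in range(start, len(order)):
            candidate = prefix + [order[k]]
            s = pattern_support(root, candidate, 0, rank)
            if s >= min_support:
                results[tuple(candidate)] = s
                extend(candidate, k + 1)  # go deeper before trying siblings

    extend([], 0)
    return results
```

Calling `mine_frequent_patterns(transactions, min_support)` with a list of item lists returns a dictionary that maps each frequent pattern, as a tuple in the global item order, to its support. Because the search extends a pattern only after confirming that it is frequent, supersets of an infrequent pattern are never examined.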
J. X. Yu, H. Lu, G. Liu and Y. Xu, "From Path Tree To Frequent Patterns: A Framework for Mining Frequent Patterns," 2002 IEEE International Conference on Data Mining, 2002. Proceedings. (ICDM), Maebashi City, Japan, 2002, pp. 514.