Issue No. 09 - September (2007 vol. 19)
This paper addresses the problem of finding frequent closed patterns (FCPs) from very dense datasets. We introduce two compressed hierarchical FCP mining algorithms, C-Miner and B-Miner. The two algorithms compress the original mining space, hierarchically partition the whole mining task into independent subtasks, and mine each subtask progressively. They adopt different task-partitioning strategies: C-Miner partitions the mining task based on Compact Matrix Division, whereas B-Miner partitions the task based on Base Rows Projection. The compressed hierarchical mining algorithms enhance the mining efficiency and facilitate a progressive refinement of results. Moreover, because the subtasks can be mined independently, C-Miner and B-Miner can be readily parallelized without incurring significant communication overhead. We have implemented C-Miner and B-Miner, and our performance study on synthetic datasets and real dense microarray datasets shows their effectiveness over existing schemes. We also report experimental results on parallel versions of these two methods.
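For readers unfamiliar with the term, the following minimal sketch illustrates what a frequent closed pattern is. It is a brute-force enumeration over row subsets, not C-Miner or B-Miner, and it does no compression or task partitioning. The function name `frequent_closed_patterns`, the `min_support` parameter, and the toy data are assumptions made for illustration only. It relies on the fact that a pattern is closed exactly when it equals the intersection of all rows that contain it.

```python
from itertools import combinations


def frequent_closed_patterns(rows, min_support):
    """Brute-force FCP enumeration (illustrative only, exponential in the row count).

    Every closed pattern is the intersection of the rows that support it, so
    intersecting each group of at least `min_support` rows yields every
    frequent closed pattern.
    """
    row_sets = [frozenset(r) for r in rows]
    n = len(row_sets)
    closed = {}
    for k in range(max(1, min_support), n + 1):
        for idx in combinations(range(n), k):
            pattern = frozenset.intersection(*(row_sets[i] for i in idx))
            if not pattern:
                continue  # skip the empty pattern
            # Support = number of rows that contain the pattern.
            closed[pattern] = sum(1 for r in row_sets if pattern <= r)
    return closed


# Hypothetical toy dataset: each row is a set of items.
rows = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c"}]
for pattern, support in frequent_closed_patterns(rows, min_support=2).items():
    print(sorted(pattern), support)
```

In this toy dataset, `{a}` is frequent (three rows contain it) but not closed, because its superset `{a, b}` occurs in the same three rows, so only `{a, b}` is reported. The cost grows exponentially with the number of rows, which is why dense datasets call for more efficient strategies such as those the paper proposes.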
Frequent closed patterns, progressive, dense datasets, data mining, parallel mining
L. Ji, K. Tan and A. Tung, "Compressed Hierarchical Mining of Frequent Closed Patterns from Dense Data Sets," in IEEE Transactions on Knowledge & Data Engineering, vol. 19, no. 9, pp. 1175-1187, 2007.