Issue No. 07 - July (2014 vol. 26)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2013.27
Guimei Liu, Data Analytics Department, Institute for Infocomm Research, Singapore
Haojun Zhang, Department of Computer Science, National University of Singapore, Singapore
Limsoon Wong, Department of Computer Science, National University of Singapore, Singapore
Frequent pattern mining often produces an enormous number of frequent patterns, which poses a great challenge for visualizing, understanding, and further analyzing the generated patterns. This calls for finding a small number of representative patterns that best approximate all other patterns. In this paper, we develop an algorithm called MinRPset to find a minimum representative pattern set with an error guarantee. MinRPset produces the smallest solution that we can possibly have in practice under the given problem setting, and it takes a reasonable amount of time to finish when the number of frequent closed patterns is below one million. However, MinRPset is very space-consuming and time-consuming on some dense datasets when the number of frequent closed patterns is large. To solve this problem, we propose another algorithm called FlexRPset, which provides one extra parameter K that allows users to make a trade-off between result size and efficiency. We adopt an incremental approach to let users make this trade-off conveniently. Our experimental results show that MinRPset and FlexRPset produce fewer representative patterns than RPlocal, an efficient algorithm developed for solving the same problem.
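To make the problem setting concrete, the following is a minimal Python sketch of selecting representative patterns under an error bound. It is not MinRPset or FlexRPset: the abstract does not define the error measure or the selection procedure, so the sketch assumes one common formulation, in which a pattern P is covered by a representative R when P is a subset of R and 1 - supp(R)/supp(P) is at most a threshold eps, and it picks representatives with a plain greedy set-cover heuristic. It also assumes representatives are drawn only from the frequent patterns themselves. The function names, the brute-force miner, and the toy transactions are hypothetical and only suitable for tiny inputs.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_patterns(transactions, min_sup):
    """Brute-force miner (toy scale only): map each frequent itemset to its support."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, k):
            pattern = frozenset(combo)
            s = support(pattern, transactions)
            if s >= min_sup:
                result[pattern] = s
                found = True
        if not found:
            break  # no frequent k-itemset implies no frequent larger itemset
    return result


def covers(rep, pat, sup, eps):
    """Assumed cover test: pat is a subset of rep and the support gap is within eps."""
    return pat <= rep and 1 - sup[rep] / sup[pat] <= eps


def greedy_representatives(sup, eps):
    """Greedy set cover: repeatedly pick the pattern covering the most uncovered ones."""
    cover_sets = {r: {p for p in sup if covers(r, p, sup, eps)} for r in sup}
    uncovered = set(sup)
    chosen = []
    while uncovered:
        # Ties are broken deterministically by pattern size, then by item names.
        best = max(
            cover_sets,
            key=lambda r: (len(cover_sets[r] & uncovered), len(r), sorted(r)),
        )
        chosen.append(best)
        uncovered -= cover_sets[best]
    return chosen


# Hypothetical toy data: five transactions over items a, b, c, d.
transactions = [set("abc"), set("abc"), set("ab"), set("ab"), set("d")]
sup = frequent_patterns(transactions, min_sup=2)
exact = greedy_representatives(sup, eps=0.0)
loose = greedy_representatives(sup, eps=0.5)
```

Every pattern covers itself, so the loop always makes progress. A larger eps lets one representative stand in for more patterns, which is the size-versus-accuracy tension the abstract describes; the greedy heuristic gives an approximation and is not guaranteed to return a true minimum set.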
G. Liu, H. Zhang and L. Wong, "A Flexible Approach to Finding Representative Pattern Sets," in IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 7, pp. 1562-1574, 2014.