Issue No. 09 - September (2011 vol. 23)
ISSN: 1041-4347
pp: 1388-1405
Yong Liu , Harbin Institute of Technology, Harbin
Jianzhong Li , Harbin Institute of Technology, Harbin
Hong Gao , Harbin Institute of Technology, Harbin
We investigate the problem of summarizing frequent subgraphs by a smaller set of representative patterns. We show that some special graph patterns, called δ-jump patterns in this paper, must be representative patterns. Based on this fact, we devise two algorithms, RP-FP and RP-GD, to mine a representative set that summarizes frequent subgraphs. RP-FP derives a representative set from frequent closed subgraphs, whereas RP-GD mines a representative set directly from graph databases. Three novel heuristic strategies, Last-Succeed-First-Check, Reverse-Path-Trace, and Nephew-Representative-Based-Cover, are proposed to further improve the efficiency of RP-GD. RP-FP provides a tight ratio bound but incurs a heavy computational cost. RP-GD cannot guarantee a ratio bound but is more efficient than RP-FP. We also exploit the similarity between sibling branches in the graph pattern space to devise a third, much more efficient algorithm, RP-Leap, which mines a representative set that approximately summarizes frequent subgraphs. Our extensive experiments on both real and synthetic data sets verify the summarization quality and efficiency of our algorithms. To further demonstrate the interestingness of representative patterns, we study an application of representative patterns to classification. We demonstrate that the classification accuracy achieved by a representative-pattern-based model is no less than that achieved by a closed-graph-pattern-based model.
Data mining, graph mining, pattern summarization.
Yong Liu, Jianzhong Li, Hong Gao, "Efficient Algorithms for Summarizing Graph Patterns", IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 9, pp. 1388-1405, September 2011, doi:10.1109/TKDE.2010.249