2009 Ninth IEEE International Conference on Data Mining (2009)
Dec. 6, 2009 to Dec. 9, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2009.54
The recent proliferation of graph data in a wide spectrum of applications has led to an increasing demand for advanced data analysis techniques. In view of this, many graph mining techniques, such as frequent subgraph mining and correlated subgraph mining, have been proposed. In many applications, both frequency and correlation play an important role. Thus, this paper studies a new problem of mining the set of frequent correlated subgraph pairs. A simple algorithm that combines existing algorithms for mining frequent subgraphs and correlated subgraphs multiplies the number of mining operations, the majority of which are redundant. We discover that most of the graphs correlated to a common graph are also highly correlated with one another. We establish theoretical foundations for this finding and derive a tight lower bound on the correlation of any two graphs that are correlated to a common graph. This theoretical result leads to the design of a very effective skipping mechanism, by which we skip the processing of a majority of graphs in the mining process. Our algorithm, FCP-Miner, is a fast approximate algorithm, but we show that the pairs it misses are only a small set of marginally correlated pairs. Extensive experiments verify both the efficiency and effectiveness of FCP-Miner.
graph mining, Pearson's correlation coefficient, frequent correlated subgraph pairs
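The keywords name Pearson's correlation coefficient as the correlation measure. As a rough illustration only, the sketch below computes that coefficient for two subgraphs from the sets of database graphs in which each one occurs, treating occurrence as a binary variable (the phi-coefficient form of Pearson's correlation). The function name `pearson_correlation`, the set-based representation of occurrences, and the choice to return 0.0 when the coefficient is undefined are assumptions made for this example; they are not taken from the paper, and this is not FCP-Miner itself.

```python
from math import sqrt


def pearson_correlation(occ_a, occ_b, db_size):
    """Pearson's correlation of two binary occurrence patterns.

    occ_a, occ_b: sets of IDs of the database graphs that contain
                  subgraph A and subgraph B (hypothetical representation).
    db_size:      total number of graphs in the database.
    """
    if db_size <= 0:
        raise ValueError("db_size must be positive")

    # Relative supports of A, of B, and of A and B occurring together.
    supp_a = len(occ_a) / db_size
    supp_b = len(occ_b) / db_size
    supp_ab = len(occ_a & occ_b) / db_size

    denom = sqrt(supp_a * supp_b * (1 - supp_a) * (1 - supp_b))
    if denom == 0:
        # The coefficient is undefined when a subgraph occurs in no graph
        # or in every graph; returning 0.0 here is an assumption.
        return 0.0
    return (supp_ab - supp_a * supp_b) / denom


# Usage: a database of 10 graphs, two subgraphs sharing 3 of 4 occurrences.
phi = pearson_correlation({0, 1, 2, 3}, {0, 1, 2, 4}, 10)
```

A pair would then be reported only if it also meets a frequency requirement; how the paper combines the support and correlation thresholds is not stated in the abstract, so that step is left out here.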
Y. Ke, J. X. Yu and J. Cheng, "Efficient Discovery of Frequent Correlated Subgraph Pairs," 2009 Ninth IEEE International Conference on Data Mining (ICDM), Miami, Florida, 2009, pp. 239-248.