Issue No.07 - July (2011 vol.23)
Jiefeng Cheng, The University of Hong Kong, Hong Kong
Jeffrey Xu Yu, The Chinese University of Hong Kong, Hong Kong
Philip S. Yu, University of Illinois at Chicago, Chicago
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2010.169
Due to the rapid growth of the Internet and new scientific and technological advances, many new applications model data as graphs, because graphs are expressive enough to model complicated structures. The prevalence of graphs in real-world applications demands new graph processing techniques to access large data graphs effectively and efficiently. In this paper, we study a graph pattern matching problem: finding all patterns in a large data graph that match a user-given graph pattern. We propose new two-step R-join (reachability join) algorithms, consisting of a filter step (R-semijoin) and a fetch step (R-join), which utilize a new cluster-based join index with graph codes in a relational database context. We also propose two approaches to optimize sequences of R-joins/R-semijoins. The first approach is based on R-join order selection followed by R-semijoin enhancement, and the second approach interleaves R-joins with R-semijoins. We conducted extensive performance studies, which confirm the efficiency of our proposed approaches.
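To make the filter/fetch idea concrete, the following is a minimal in-memory sketch of a reachability semijoin followed by a reachability join over a center-keyed cluster index. It is not the paper's implementation: the function names, the toy graph, and the node labels are all hypothetical, and the graph codes use a deliberately trivial 2-hop labeling (each node's out-code is its full descendant set, each in-code is the node itself), which is valid but far less compact than a real 2-hop cover. The paper's setting is a relational database, whereas this sketch uses plain Python sets.

```python
from collections import defaultdict


def descendants(adj, start):
    """Return every node reachable from start (including start itself)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def build_codes(adj, nodes):
    """Assumed trivial 2-hop labeling: u reaches v iff out_code[u] & in_code[v]."""
    out_code = {u: descendants(adj, u) for u in nodes}
    in_code = {v: {v} for v in nodes}
    return out_code, in_code


def build_cluster_index(out_code, in_code):
    """For each center w, collect the nodes that reach w and the nodes w reaches."""
    f_cluster = defaultdict(set)  # center -> nodes that can reach the center
    t_cluster = defaultdict(set)  # center -> nodes reachable from the center
    for u, centers in out_code.items():
        for w in centers:
            f_cluster[w].add(u)
    for v, centers in in_code.items():
        for w in centers:
            t_cluster[w].add(v)
    return f_cluster, t_cluster


def r_semijoin(left, right, f_cluster, t_cluster):
    """Filter step: keep the left nodes that reach at least one right node."""
    left, right = set(left), set(right)
    keep = set()
    for w, targets in t_cluster.items():
        if targets & right:
            keep |= f_cluster.get(w, set())
    return keep & left


def r_join(left, right, f_cluster, t_cluster):
    """Fetch step: return all (u, v) pairs with u in left reaching v in right."""
    left, right = set(left), set(right)
    pairs = set()
    for w, targets in t_cluster.items():
        sources = f_cluster.get(w, set()) & left
        pairs.update((u, v) for u in sources for v in targets & right)
    return pairs


# Hypothetical toy data graph: a1 -> b1 -> d1, a2 -> c1, and d2 is isolated.
adj = {"a1": ["b1"], "b1": ["d1"], "a2": ["c1"]}
nodes = ["a1", "a2", "b1", "c1", "d1", "d2"]
label_a = {"a1", "a2"}  # nodes carrying label A
label_d = {"d1", "d2"}  # nodes carrying label D

out_code, in_code = build_codes(adj, nodes)
f_cluster, t_cluster = build_cluster_index(out_code, in_code)

# Pattern edge A -> D: filter the A side first, then fetch the matching pairs.
candidates = r_semijoin(label_a, label_d, f_cluster, t_cluster)
matches = r_join(candidates, label_d, f_cluster, t_cluster)
```

In this sketch the semijoin discards `a2` before any pairs are materialized, so the join only enumerates pairs for surviving candidates. That pruning effect is the motivation for a filter step ahead of a fetch step; how the paper orders or interleaves such steps across a multi-edge pattern is not reproduced here.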
Graph matching, 2-hop labeling, reachability joins, join/semijoin processing.
Jiefeng Cheng, Jeffrey Xu Yu, Philip S. Yu, "Graph Pattern Matching: A Join/Semijoin Approach", IEEE Transactions on Knowledge & Data Engineering, vol.23, no. 7, pp. 1006-1021, July 2011, doi:10.1109/TKDE.2010.169