2008 IEEE 24th International Conference on Data Engineering (2008)
Apr. 7, 2008 to Apr. 12, 2008
Jiefeng Cheng, The Chinese University of Hong Kong, China. firstname.lastname@example.org
Jeffrey Xu Yu, The Chinese University of Hong Kong, China. email@example.com
Bolin Ding, The Chinese University of Hong Kong, China. firstname.lastname@example.org
Philip S. Yu, University of Illinois at Chicago, USA. email@example.com
Haixun Wang, T. J. Watson Research Center, IBM, USA. firstname.lastname@example.org
Due to the rapid growth of Internet technology and new scientific/technological advances, the number of applications that model data as graphs is increasing, because graphs have high expressive power to model complicated structures. The dominance of graphs in real-world applications calls for new graph data management techniques so that users can access graph data effectively and efficiently. In this paper, we study a graph pattern matching problem over a large data graph. The problem is to find all patterns in a large data graph that match a user-given graph pattern. We propose a new two-step R-join (reachability join) algorithm, with a filter step and a fetch step, based on a cluster-based join-index with graph codes. We consider the filter step as an R-semijoin, and propose a new optimization approach that interleaves R-joins with R-semijoins. We conducted extensive performance studies, which confirm the efficiency of our proposed approaches.
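To make the problem statement concrete, the following is a minimal sketch of a naive R-join and R-semijoin, assuming that each pattern edge (X, Y) asks for pairs of data nodes labeled X and Y such that the first reaches the second. It computes reachability by plain breadth-first search and is a baseline for illustration only; it does not implement the cluster-based join-index or graph codes. The function names, the adjacency-dictionary representation, and the sample data are hypothetical.

```python
from collections import deque


def reachable_from(graph, source):
    """Return the set of nodes reachable from `source` via one or more edges (BFS)."""
    seen = set()
    queue = deque(graph.get(source, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))
    return seen


def r_join(graph, labels, label_x, label_y):
    """Naive R-join: all (u, v) where u has label_x, v has label_y, and u reaches v."""
    pairs = []
    for u in sorted(n for n, lab in labels.items() if lab == label_x):
        reach = reachable_from(graph, u)
        pairs.extend((u, v) for v in sorted(reach) if labels.get(v) == label_y)
    return pairs


def r_semijoin(graph, labels, label_x, label_y):
    """Filter only: nodes with label_x that reach at least one node with label_y."""
    return sorted({u for u, _ in r_join(graph, labels, label_x, label_y)})


def match_pattern(graph, labels, pattern):
    """Match a pattern given as a list of (label_x, label_y) reachability conditions.

    Assumes each label occurs at most once as a pattern node. Returns a list of
    dicts mapping each pattern label to the data node bound to it.
    """
    results = [{}]
    for label_x, label_y in pattern:
        pairs = r_join(graph, labels, label_x, label_y)
        merged = []
        for partial in results:
            for u, v in pairs:
                # Keep the pair only if it agrees with bindings made so far.
                if partial.get(label_x, u) == u and partial.get(label_y, v) == v:
                    merged.append({**partial, label_x: u, label_y: v})
        results = merged
    return results


# Hypothetical data graph: adjacency lists plus one label per node.
graph = {"a1": ["b1"], "b1": ["c1"], "a2": ["c2"], "c1": [], "c2": []}
labels = {"a1": "A", "a2": "A", "b1": "B", "c1": "C", "c2": "C"}

print(match_pattern(graph, labels, [("A", "B"), ("B", "C")]))
# prints [{'A': 'a1', 'B': 'b1', 'C': 'c1'}]
```

In this sketch the R-semijoin plays the role of a filter: it discards nodes that cannot take part in any match before the pairs themselves are produced. Here it is derived from the full join for brevity, so it illustrates the semantics rather than any performance benefit.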
H. Wang, B. Ding, J. X. Yu, P. S. Yu and J. Cheng, "Fast Graph Pattern Matching," 2008 IEEE 24th International Conference on Data Engineering (ICDE), Cancun, Mexico, 2008, pp. 913-922.