Proceedings 18th International Conference on Data Engineering (2002)
San Jose, California
Feb. 26, 2002 to Mar. 1, 2002
Shurug Al-Khalifa, University of Michigan
H. V. Jagadish, University of Michigan
Jignesh M. Patel, University of Michigan
Yuqing Wu, University of Michigan
Nick Koudas, AT&T Labs-Research
Divesh Srivastava, AT&T Labs-Research
XML queries typically specify patterns of selection predicates on multiple elements that have some specified tree-structured relationships. The primitive tree-structured relationships are parent-child and ancestor-descendant, and finding all occurrences of these relationships in an XML database is a core operation for XML query processing. In this paper, we develop two families of structural join algorithms for this task: tree-merge and stack-tree. The tree-merge algorithms are a natural extension of traditional merge joins and the recently proposed multi-predicate merge joins, while the stack-tree algorithms have no counterpart in traditional relational join processing. We present experimental results on a range of data and queries using the TIMBER native XML query engine built on top of SHORE. We show that while, in some cases, tree-merge algorithms can have performance comparable to stack-tree algorithms, in many cases they are considerably worse. This behavior is explained by analytical results that demonstrate that, on sorted inputs, the stack-tree algorithms have worst-case I/O and CPU complexities linear in the sum of the sizes of inputs and output, while the tree-merge algorithms do not have the same guarantee.
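To make the stack-tree idea concrete, the following is a minimal in-memory sketch of a stack-based structural join, not the implementation evaluated in the paper. It assumes each element is labeled with a (start, end, level) region encoding, which the abstract does not specify, and that both input lists are sorted by start position. The names Node and stack_tree_join are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    # Assumed region encoding: an element spans [start, end] in document
    # order, and level is its depth in the tree. This labeling is an
    # assumption of the sketch, not something stated in the abstract.
    start: int
    end: int
    level: int


def stack_tree_join(
    ancestors: List[Node],
    descendants: List[Node],
    parent_child: bool = False,
) -> List[Tuple[Node, Node]]:
    """Join two lists sorted by start; output is ordered by descendant."""
    out: List[Tuple[Node, Node]] = []
    stack: List[Node] = []  # nested chain of currently "open" ancestors
    a = d = 0
    while d < len(descendants) and (a < len(ancestors) or stack):
        # Take whichever input node comes first in document order.
        take_anc = a < len(ancestors) and ancestors[a].start < descendants[d].start
        nxt = ancestors[a] if take_anc else descendants[d]
        # Discard stacked ancestors whose region closed before nxt begins.
        while stack and stack[-1].end < nxt.start:
            stack.pop()
        if take_anc:
            stack.append(nxt)
            a += 1
        else:
            # Every node left on the stack contains nxt.
            for s in stack:
                if not parent_child or s.level + 1 == nxt.level:
                    out.append((s, nxt))
            d += 1
    return out


# Small example document: a1 contains a2 and d2; a2 contains d1; a3 contains d3.
a1, a2, a3 = Node(1, 10, 1), Node(2, 5, 2), Node(11, 14, 1)
d1, d2, d3 = Node(3, 4, 3), Node(6, 7, 2), Node(12, 13, 2)
anc_desc_pairs = stack_tree_join([a1, a2, a3], [d1, d2, d3])
parent_child_pairs = stack_tree_join([a1, a2, a3], [d1, d2, d3], parent_child=True)
```

Each input node is pushed and popped at most once, so for the ancestor-descendant case the work in this sketch is proportional to the input sizes plus the number of output pairs, which is the kind of linear bound the abstract attributes to the stack-tree family. The parent-child filter here scans the whole stack for simplicity.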
Y. Wu, S. Al-Khalifa, D. Srivastava, J. M. Patel, N. Koudas and H. V. Jagadish, "Structural Joins: A Primitive for Efficient XML Query Pattern Matching," Proceedings 18th International Conference on Data Engineering (ICDE), San Jose, California, 2002, p. 141.