Subscribe

Issue No.02 - February (2010 vol.22)

pp: 264-277

Carlos Ordonez , University of Houston, Houston

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2009.83

ABSTRACT

Recursion is a fundamental computation mechanism which has been incorporated into the SQL language. This work focuses on the optimization of linear recursive queries in SQL. Query optimization is studied with two important graph problems: computing the transitive closure of a graph and getting the power matrix of its adjacency matrix. We present SQL implementations for two fundamental algorithms: Seminaive and Direct. Five query optimizations are studied: 1) Storage and indexing; 2) early selection; 3) early evaluation of nonrecursive joins; 4) pushing duplicate elimination; and 5) pushing aggregation. Experiments compare both evaluation algorithms and systematically evaluate the impact of optimizations with large input tables. Optimizations are evaluated on four types of graphs: binary trees, lists, cyclic graphs, and complete graphs, going from the best to worst case. In general, Seminaive is faster than Direct, except for complete graphs. Storing and indexing rows by vertex and pushing aggregation work well on trees, lists, and cyclic graphs. Pushing duplicate elimination is essential for complete graphs, but slows computation for acyclic graphs. Early selection with equality predicates significantly accelerates computation for all types of graphs.

INDEX TERMS

Recursive query, SQL, query optimization, transitive closure.

CITATION

Carlos Ordonez, "Optimization of Linear Recursive Queries in SQL",

*IEEE Transactions on Knowledge & Data Engineering*, vol.22, no. 2, pp. 264-277, February 2010, doi:10.1109/TKDE.2009.83REFERENCES