Subscribe
Issue No.02 - February (2010 vol.22)
pp: 264-277
Carlos Ordonez , University of Houston, Houston
ABSTRACT
Recursion is a fundamental computation mechanism which has been incorporated into the SQL language. This work focuses on the optimization of linear recursive queries in SQL. Query optimization is studied with two important graph problems: computing the transitive closure of a graph and getting the power matrix of its adjacency matrix. We present SQL implementations for two fundamental algorithms: Seminaive and Direct. Five query optimizations are studied: 1) Storage and indexing; 2) early selection; 3) early evaluation of nonrecursive joins; 4) pushing duplicate elimination; and 5) pushing aggregation. Experiments compare both evaluation algorithms and systematically evaluate the impact of optimizations with large input tables. Optimizations are evaluated on four types of graphs: binary trees, lists, cyclic graphs, and complete graphs, going from the best to worst case. In general, Seminaive is faster than Direct, except for complete graphs. Storing and indexing rows by vertex and pushing aggregation work well on trees, lists, and cyclic graphs. Pushing duplicate elimination is essential for complete graphs, but slows computation for acyclic graphs. Early selection with equality predicates significantly accelerates computation for all types of graphs.
INDEX TERMS
Recursive query, SQL, query optimization, transitive closure.
CITATION
Carlos Ordonez, "Optimization of Linear Recursive Queries in SQL", IEEE Transactions on Knowledge & Data Engineering, vol.22, no. 2, pp. 264-277, February 2010, doi:10.1109/TKDE.2009.83
REFERENCES
 [1] S. Abiteboul, R. Hull, and V. Vianu, Foundations of Databases. Addison-Wesley, 1995. [2] R. Agrawal, S. Dar, and H.V. Jagadish, “Direct and Transitive Closure Algorithms: Design and Performance Evaluation,” ACM Trans. Database Systems, vol. 15, no. 3, pp. 427-458, 1990. [3] F. Bancilhon and R. Ramakrishnan, “An Amateur's Introduction to Recursive Query Processing Strategies,” Proc. ACM SIGMOD, pp. 16-52, 1986. [4] S. Chaudhuri, “An Overview of Query Optimization in Relational Systems,” Proc. ACM Principles of Database Systems (PODS) Conf., pp. 84-93, 1998. [5] S. Dar and R. Agrawal, “Extending SQL with Generalized Transitive Closure,” IEEE Trans. Knowledge and Data Eng., vol. 5, no. 5, pp. 799-812, Oct. 1993. [6] G. Dong and J. Su, “Incremental Maintenance of Recursive Views Using Relational Calculus/SQL,” SIGMOD Record, vol. 29, no. 1, pp. 44-51, 2000. [7] G. Graefe, “Query Evaluation Techniques for Large Databases,” ACM Computing Surveys, vol. 25, no. 2, pp. 73-170, 1993. [8] J. Gray, A. Bosworth, A. Layman, and H. Pirahesh, “Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab and Sub-Total,” Proc. Int'l Conf. Data Eng. (ICDE), pp. 152-159, 1996. [9] J. Han and L.J. Henschen, “Handling Redundancy in the Processing of Recursive Database Queries,” Proc. ACM SIGMOD, pp. 73-81, 1987. [10] Y.E. Ioannidis, R. Ramakrishnan, and L. Winger, “Transitive Closure Algorithms Based on Graph Traversal,” ACM Trans. Database Systems, vol. 18, no. 3, pp. 512-576, 1993. [11] K. Koymen and Q. Cai, “SQL${}^{\ast}$ : A Recursive SQL,” Information Systems, vol. 18, no. 2, pp. 121-128, 1993. [12] L. Libkin and L. Wong, “Incremental Recomputation of Recursive Queries with Nested Sets and Aggregate Functions,” Proc. Int'l Workshop Database Programming Languages (DBPL), pp. 222-238, 1997. [13] V. Linnemann, “Non First Normal Form Relations and Recursive Queries: An SQL-Based Approach,” Proc. IEEE Int'l Conf. Data Eng. (ICDE), pp. 591-598, 1987. [14] I.S. Mumick, S.J. Finkelstein, H. Pirahesh, and R. Ramakrishnan, “Magic Is Relevant,” Proc. ACM SIGMOD, pp. 247-258, 1990. [15] I.S. Mumick, S.J. Finkelstein, H. Pirahesh, and R. Ramakrishnan, “Magic Conditions,” ACM Trans. Database Systems, vol. 21, no. 1, pp. 107-155, 1996. [16] I.S. Mumick and H. Pirahesh, “Implementation of Magic-Sets in a Relational Database System,” Proc. ACM SIGMOD, pp. 103-114, 1994. [17] C. Ordonez, “Optimizing Recursive Queries in SQL,” Proc. ACM SIGMOD, pp. 834-839, 2005. [18] R. Ramakrishnan, D. Srivastava, S. Sudarshan, and P. Seshadri, “Implementation of the Coral Deductive Database System,” Proc. ACM SIGMOD, pp. 167-176, 1993. [19] R. Ramakrishnan, D. Srivastava, S. Sudarshan, and P. Seshadri, “The CORAL Deductive System,” Very Large Data Bases J., vol. 3, no. 2, pp. 161-2120, 1994. [20] S. Seshadri and J.F. Naughton, “On the Expected Size of Recursive Datalog Queries,” Proc. ACM Principles of Database Systems (PODS) Conf., pp. 268-279, 1991. [21] S. Sippu and E.S. Soininen, “An Analysis of Magic Sets and Related Optimization Strategies for Logic Queries,” J. ACM, vol. 43, no. 6, pp. 1046-1088, 1996. [22] J.D. Ullman, “Implementation of Logical Query Languages for Databases,” ACM Trans. Database Systems, vol. 10, no. 3, pp. 289-321, 1985. [23] P. Valduriez and H. Boral, “Evaluation of Recursive Queries Using Join Indices,” Proc. Int'l Conf. Expert Database Systems, pp.271-293, 1986. [24] M.Y. Vardi, “Decidability and Undecidability Results for Boundedness of Linear Recursive Queries,” Proc. ACM Principles of Database Systems (PODS) Conf., pp. 341-351, 1988. [25] H.S. Warren, “A Modification of Warhsall's Algorithm for the Transitive Closure of Binary Relations,” Comm. ACM, vol. 18, no. 4, pp. 218-220, 1975. [26] C. Youn, H. Kim, L.J. Henschen, and J. Han, “Classification and Compilation of Linear Recursive Queries in Deductive Databases,” IEEE Trans. Knowledge and Data Eng., vol. 4, no. 1, pp. 52-67, Feb. 1992.