Parallel Optimization of Large Join Queries with Set Operators and Aggregates in a Parallel Environment Supporting Pipeline
June 1996 (vol. 8 no. 3)
pp. 429-445

Abstract: We propose a parallel optimizer for queries containing a large number of joins, as well as set operators and aggregate functions. The execution platform is a shared-disk multiprocessor machine supporting bushy parallelism and pipelining. Our model partitions the query into almost independent subtrees that can be optimized simultaneously, and applies an enhanced variation of the iterative improvement technique to those subtrees that contain a large number of joins. This technique is itself parallelized. To estimate the cost of the states constructed while optimizing join subtrees, we develop cost formulae for relational algebra operators executed across coalescing pipes.
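
For readers unfamiliar with iterative improvement, the following is a minimal sketch of the generic technique (random restarts, each followed by local descent) applied to join ordering. It is not the enhanced, parallelized variation the abstract refers to: it searches only left-deep join orders, runs sequentially, and uses a toy cost function (the sum of estimated intermediate result sizes) in place of the paper's pipeline-aware cost formulae. The names `plan_cost` and `iterative_improvement`, the swap move, and the cardinalities and selectivities are all assumptions made for illustration.

```python
import random


def plan_cost(order, card, sel):
    """Toy cost of a left-deep plan: sum of estimated intermediate result sizes.

    card maps relation -> cardinality; sel maps frozenset({r, s}) -> join
    selectivity (pairs without an entry are treated as cross products).
    """
    joined = [order[0]]
    size = float(card[order[0]])
    total = 0.0
    for rel in order[1:]:
        size *= card[rel]
        for prev in joined:
            size *= sel.get(frozenset((prev, rel)), 1.0)
        joined.append(rel)
        total += size
    return total


def iterative_improvement(relations, card, sel, restarts=20, max_fails=50, seed=0):
    """Generic iterative improvement: repeated random starts with local descent."""
    rng = random.Random(seed)
    best_order, best_cost = None, float("inf")
    for _ in range(restarts):
        # Start each run from a random state (a random join order).
        order = list(relations)
        rng.shuffle(order)
        cost = plan_cost(order, card, sel)
        fails = 0
        # Accept only downhill moves; stop after max_fails consecutive
        # rejected moves, which approximates reaching a local minimum.
        while fails < max_fails:
            i, j = rng.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]
            new_cost = plan_cost(order, card, sel)
            if new_cost < cost:
                cost = new_cost
                fails = 0
            else:
                order[i], order[j] = order[j], order[i]  # undo the swap
                fails += 1
        if cost < best_cost:
            best_order, best_cost = list(order), cost
    return best_order, best_cost


# Hypothetical four-relation chain query A - B - C - D.
card = {"A": 1000, "B": 100, "C": 10, "D": 10000}
sel = {
    frozenset(("A", "B")): 0.01,
    frozenset(("B", "C")): 0.1,
    frozenset(("C", "D")): 0.001,
}
order, cost = iterative_improvement(list(card), card, sel)
print(order, cost)
```

Because each restart is independent of the others, the restarts are a natural unit to distribute across processors; how the paper actually parallelizes the technique is not specified in the abstract.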


Index Terms:
Parallel query optimization, parallelism in optimization, iterative improvement, large join queries, bushy parallelism, pipeline, shared-disk architectures, query optimization, parallelism, databases.
Citation:
Myra Spiliopoulou, Michael Hatzopoulos, Yannis Cotronis, "Parallel Optimization of Large Join Queries with Set Operators and Aggregates in a Parallel Environment Supporting Pipeline," IEEE Transactions on Knowledge and Data Engineering, vol. 8, no. 3, pp. 429-445, June 1996, doi:10.1109/69.506710