Optimizing Join Queries in Distributed Databases
September 1988 (vol. 14 no. 9)
pp. 1319-1326

A reduced cover set of the set of full reducer semijoin programs for an acyclic query graph for a distributed database system is given. An algorithm is presented that determines the minimum-cost full reducer program. The computational complexity of finding the optimal full reducer for a single relation is of the same order as that of finding the optimal full reducer for all relations. The optimization algorithm is able to handle query graphs where more than one attribute is common between the relations. A method for determining the optimum profitable semijoin program is presented. A low-cost algorithm which determines a near-optimal profitable semijoin program is outlined. This is done by converting a semijoin program into a partial order graph. This graph also allows one to maximize the concurrent processing of the semijoins. It is shown that the minimum response time is given by the largest-cost path of the partial order graph. This reducibility is used as a post-optimizer for the SDD-1 query optimization algorithm. It is shown that the least upper bound on the length of any profitable semijoin program is N(N-1) for a query graph of N nodes.
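
The response-time claim can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes each semijoin has a single known cost, that a semijoin starts as soon as all of its predecessors in the partial order graph have finished, and that unrelated semijoins run fully in parallel. The function name, the semijoin labels, and the costs are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def min_response_time(cost, preds):
    """Return the cost of the largest-cost path in a partial order graph.

    cost:  maps each semijoin label to its (hypothetical) cost
    preds: maps a semijoin label to the semijoins that must finish first;
           every label mentioned here is assumed to appear in cost
    """
    finish = {}
    graph = {node: preds.get(node, ()) for node in cost}
    # static_order() yields each node only after all of its predecessors.
    for node in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in preds.get(node, ())), default=0)
        finish[node] = start + cost[node]
    return max(finish.values(), default=0)


# Hypothetical program: s3 needs the results of s1 and s2, and s4 needs s3.
cost = {"s1": 4, "s2": 3, "s3": 5, "s4": 2}
preds = {"s3": ["s1", "s2"], "s4": ["s3"]}
print(min_response_time(cost, preds))
```

In this example s1 and s2 run concurrently, so the response time follows the path s1, s3, s4 rather than the sum of all four costs.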

Index Terms:
join queries; distributed databases; reduced cover set; acyclic query graph; computational complexity; optimization; partial order graph; concurrent processing; minimum response time; database theory; graph theory
S. Pramanik, D. Vineyard, "Optimizing Join Queries in Distributed Databases," IEEE Transactions on Software Engineering, vol. 14, no. 9, pp. 1319-1326, Sept. 1988, doi:10.1109/32.6175