Combining Join and Semi-Join Operations for Distributed Query Processing
June 1993 (vol. 5 no. 3)
pp. 534-542

The application of a combination of join and semi-join operations to minimize the amount of data transmission required for distributed query processing is discussed. Specifically, two important concepts that arise when join operations are used as reducers in query processing, namely gainful semi-joins and pure join attributes, are used. Some semi-joins, though not profitable themselves, may benefit the execution of subsequent join operations as reducers. Such a semi-join is termed a gainful semi-join. In addition, join attributes that are not part of the output attributes are referred to as pure join attributes. The authors exploit the usefulness of gainful semi-joins and use the removability of pure join attributes to reduce the amount of data transmission required for query processing. Heuristic searches are developed to determine a sequence of join and semi-join reducers for query processing. Results indicate the importance of combining joins and semi-joins for distributed query processing.
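
The two concepts named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm or cost model: the relation contents, the helper names (`project`, `semi_join`, `equi_join`, `size`, `semi_join_gain`), and the toy size measure (a count of attribute values) are all assumptions made for illustration. It shows the usual profitability test for a semi-join (benefit versus the cost of shipping the projected join attribute) and the saving from dropping a pure join attribute before a join result is transmitted.

```python
def project(rel, attrs):
    """Distinct projection of a relation (list of dicts) onto attrs."""
    seen, out = set(), []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def semi_join(r, s, attr):
    """Keep only the rows of r whose attr value also appears in s."""
    values = {row[attr] for row in s}
    return [row for row in r if row[attr] in values]


def equi_join(r, s, attr):
    """Join r and s on a single shared attribute."""
    return [{**a, **b} for a in r for b in s if a[attr] == b[attr]]


def size(rel):
    """Toy size measure (an assumption): total number of attribute values."""
    return sum(len(row) for row in rel)


def semi_join_gain(r, s, attr):
    """Return (cost, benefit) of reducing r by a semi-join with s on attr.

    cost    = size of the projection of s on attr that must be shipped to r
    benefit = reduction in the size of r
    """
    cost = size(project(s, [attr]))
    benefit = size(r) - size(semi_join(r, s, attr))
    return cost, benefit


# Hypothetical relations R1(A, B) and R2(A, C) held at different sites.
R1 = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}, {"A": 3, "B": "z"}]
R2 = [{"A": 1, "C": 10}, {"A": 1, "C": 11}, {"A": 4, "C": 12}]

cost, benefit = semi_join_gain(R1, R2, "A")
profitable = benefit > cost  # a semi-join is profitable when benefit exceeds cost

# If the query outputs only B and C, then A is a pure join attribute and
# can be removed from the join result before it is shipped onward.
joined = equi_join(R1, R2, "A")
shipped = project(joined, ["B", "C"])
saving = size(joined) - size(shipped)
```

A semi-join for which `benefit <= cost` would be rejected by this local test. Recognizing it as gainful requires looking ahead to the later join that uses the reduced relation as a reducer, which is the kind of decision a search over reducer sequences has to make.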

Index Terms:
join operations; semi-join operations; heuristic searches; distributed query processing; data transmission; reducers; query processing; gainful semi-joins; pure join attributes; distributed databases
M.-S. Chen, P. S. Yu, "Combining Join and Semi-Join Operations for Distributed Query Processing," IEEE Transactions on Knowledge and Data Engineering, vol. 5, no. 3, pp. 534-542, June 1993, doi:10.1109/69.224205