Issue No. 04 - August (1996 vol. 8)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/69.536256
<p><b>Abstract</b>: While a significant amount of research effort has been reported on developing algorithms, based on joins and semijoins, to tackle distributed query processing, relatively little progress has been made toward exploring the complexity of the problems studied. As a result, many researchers have sought either to prove NP-hardness of, or to devise polynomial-time algorithms for, certain distributed query optimization problems. However, due to their inherent difficulty, the complexity of the majority of distributed query optimization problems remains unknown. In this paper, we characterize distributed query optimization problems in general terms and provide a framework for exploring their complexity. As will be shown, most distributed query optimization problems can be transformed into an optimization problem comprising a set of binary decisions, termed the Sum Product Optimization (SPO) problem. We first prove that SPO is NP-hard in light of the NP-completeness of a well-known problem, Knapsack (KNAP). Then, using this result as a basis, we prove that five classes of distributed query optimization problems, which cover the majority of distributed query optimization problems previously studied in the literature, are NP-hard by polynomially reducing SPO to each of them. The details of each problem transformation are derived. We not only prove the conjecture that many prior studies relied upon, but also provide a framework for future related studies.</p>
Index Terms: Distributed query optimization, semijoin processing, complexity, NP-hard problems, distributed databases.
Ming-Syan Chen, Chihping Wang, "On the Complexity of Distributed Query Optimization", IEEE Transactions on Knowledge & Data Engineering, vol. 8, no. 4, pp. 650-662, August 1996, doi:10.1109/69.536256