21st International Conference on Advanced Networking and Applications (AINA '07)
Efficient Query Processing for Large XML Data in Distributed Environments
Niagara Falls, Ontario, Canada
May 21-23, 2007
ISBN: 0-7695-2846-5
We propose an efficient distributed query processing method for large XML data that partitions the data and distributes the partitions to multiple computation nodes. The method involves several steps; in this work we focus on XML data partitioning and on the dynamic relocation of partitioned XML data. Because the efficiency of query processing depends on both the size and the structure of the XML data, both factors should be considered when the data is partitioned. The partitions are distributed to computation nodes so that the CPU load is balanced. It is also important to take into account the query workload on each computation node, because it is closely related to the query processing cost in distributed environments. When the load is skewed among computation nodes, partitioned XML data should be relocated to rebalance the CPU load. We therefore implemented an algorithm that relocates partitioned XML data based on the CPU load of query processing. Our experiments show that this approach offers a performance advantage for distributed query processing of large XML data.
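The relocation idea described in the abstract can be illustrated with a small sketch. This is not the algorithm from the paper, whose details are not given here; it is a simple greedy heuristic under stated assumptions: each partition has a measured CPU load in arbitrary units, and a partition is moved from the busiest node to the idlest node whenever that narrows the gap between them. The names `Node`, `relocate`, and `tolerance` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A computation node holding XML partitions (hypothetical model)."""
    name: str
    # Maps a partition id to the CPU load (arbitrary units) its queries cause.
    partitions: dict[str, float] = field(default_factory=dict)

    @property
    def load(self) -> float:
        return sum(self.partitions.values())


def relocate(nodes: list[Node], tolerance: float = 0.1) -> list[tuple[str, str, str]]:
    """Greedily move partitions from the busiest node to the idlest node.

    Stops when the load gap is within `tolerance` times the mean load,
    or when no single move would narrow the gap.
    Returns the moves as (partition_id, source, destination) tuples.
    """
    moves = []
    while True:
        busiest = max(nodes, key=lambda n: n.load)
        idlest = min(nodes, key=lambda n: n.load)
        gap = busiest.load - idlest.load
        mean = sum(n.load for n in nodes) / len(nodes)
        if mean == 0 or gap <= tolerance * mean:
            break
        # Only a partition lighter than the gap narrows it when moved.
        candidates = {p: l for p, l in busiest.partitions.items() if 0 < l < gap}
        if not candidates:
            break
        # Prefer the partition whose load is closest to half the gap.
        pid = min(candidates, key=lambda p: abs(candidates[p] - gap / 2))
        idlest.partitions[pid] = busiest.partitions.pop(pid)
        moves.append((pid, busiest.name, idlest.name))
    return moves
```

A real system would also weigh the cost of shipping a partition across the network against the expected gain, which this sketch ignores.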
Citation:
Hiroto Kurita, Kenji Hatano, Jun Miyazaki, Shunsuke Uemura, "Efficient Query Processing for Large XML Data in Distributed Environments," 21st International Conference on Advanced Networking and Applications (AINA '07), pp. 317-322, 2007