10th International Database Engineering and Applications Symposium (IDEAS'06)
Parallelizing XQuery In a Cluster Environment
Delhi, India
December 11-December 14
ISBN: 0-7695-2577-6
In this paper, we report on a parallel implementation of XQuery. As XQuery is increasingly used for processing large datasets and for compute-intensive applications, the efficiency of XQuery implementations is becoming an important issue. Our work has specifically focused on scientific data processing and data mining applications. Parallelizing this class of XQuery queries involves a number of challenges, including data distribution, parallelization of generalized reductions, and translation to an imperative language such as C/C++ so that efficient parallel communication libraries can be invoked. We present our solutions to these problems. By implementing the techniques in a compiler and generating code based on a C++ SAX parser and the Message Passing Interface (MPI), we are able to achieve efficient parallel execution on a cluster of machines.
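To make the term "generalized reduction" concrete, the following is a minimal sketch of the pattern and not the code the compiler described above generates. Each worker folds its own slice of the data into a private reduction object, and the private objects are then merged with an associative, commutative combine step. The histogram reduction object and the names `local_reduce`, `combine`, and `parallel_histogram` are hypothetical, chosen only for illustration. The workers are simulated by a sequential loop so that the example stays self-contained; on a cluster, each iteration would presumably run as a separate MPI process and the merge would be a collective operation such as `MPI_Allreduce`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduction object: a fixed-size histogram of bucket counts.
using Histogram = std::vector<long>;

// Local phase: one worker folds its slice [begin, end) into a private histogram.
// Assumes all data values are non-negative.
Histogram local_reduce(const std::vector<int>& data,
                       std::size_t begin, std::size_t end, int num_buckets) {
    Histogram h(static_cast<std::size_t>(num_buckets), 0);
    for (std::size_t i = begin; i < end; ++i) {
        h[static_cast<std::size_t>(data[i] % num_buckets)] += 1;
    }
    return h;
}

// Global phase: element-wise sum. Because the operation is associative and
// commutative, the order in which workers are merged does not affect the result.
void combine(Histogram& into, const Histogram& from) {
    assert(into.size() == from.size());
    for (std::size_t i = 0; i < into.size(); ++i) {
        into[i] += from[i];
    }
}

// Simulates num_workers processes with a sequential loop over "ranks".
// The data is block-distributed: rank r owns a contiguous slice of the input.
Histogram parallel_histogram(const std::vector<int>& data,
                             int num_workers, int num_buckets) {
    assert(num_workers > 0 && num_buckets > 0);
    Histogram global(static_cast<std::size_t>(num_buckets), 0);
    const std::size_t n = data.size();
    const std::size_t w = static_cast<std::size_t>(num_workers);
    for (std::size_t rank = 0; rank < w; ++rank) {
        const std::size_t begin = n * rank / w;
        const std::size_t end = n * (rank + 1) / w;
        combine(global, local_reduce(data, begin, end, num_buckets));
    }
    return global;
}
```

Because the combine step is order-independent, calling `parallel_histogram` with one worker or with several should yield the same histogram. This property is what allows the data to be distributed across processes freely.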