Projective Distribution of XQuery with Updates
August 2010 (vol. 22 no. 8)
pp. 1059-1076
Ying Zhang, Centrum Wiskunde & Informatica, Amsterdam
Nan Tang, Centrum Wiskunde & Informatica, Amsterdam
Peter Boncz, Centrum Wiskunde & Informatica, Amsterdam
We investigate techniques to automatically decompose any XQuery query, including updating queries specified by the XQuery Update Facility (XQUF), into subqueries that can be executed near their data sources, i.e., function-shipping. The main challenge addressed here is to ensure that the decomposed queries properly respect XML node identity and preserve structural properties when (parts of) XML nodes are sent over the network, which effectively copies them. We start by precisely characterizing the conditions under which pass-by-value parameter passing causes semantic differences between remote execution of an XQuery expression and its local execution. We then formulate a conservative strategy that avoids decomposition in such cases. To broaden the possibilities of query distribution, we extend the pass-by-value semantics to a pass-by-fragment semantics, which keeps better track of node identities and structural properties. The pass-by-fragment semantics is subsequently refined to a pass-by-projection semantics by means of a novel runtime XML projection technique, which safely eliminates most semantic differences between the local and remote execution of an XQuery expression and strongly reduces message sizes. Finally, we discuss how these techniques can be used for updating queries, both under the standard W3C XQUF specification and under an extended semantics that allows remote documents to be updated. The proposed techniques are implemented in XRPC, a simple yet efficient XQuery extension that enables function-shipping by adding a Remote Procedure Call mechanism to XQuery. Experiments on MonetDB/XQuery establish the performance potential of our XQuery decomposition techniques.
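
The node-identity problem described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper and does not use XRPC or XQuery; it is plain Python, and the helper names `same_node` and `ship_by_value` are hypothetical stand-ins for XQuery's `is` comparison and for pass-by-value parameter shipping, respectively. It assumes that shipping a node amounts to serializing it and parsing it again on the receiving side.

```python
import xml.etree.ElementTree as ET


def same_node(a, b):
    """Hypothetical stand-in for XQuery's node-identity comparison (`is`)."""
    return a is b


def ship_by_value(node):
    """Hypothetical stand-in for pass-by-value shipping: the receiver
    gets a freshly parsed copy of the serialized node, not the node itself."""
    return ET.fromstring(ET.tostring(node))


doc = ET.fromstring("<a><b><c/></b></a>")
b = doc.find("b")
c = b.find("c")

# Local execution: both arguments refer to nodes of the same tree.
local_identity = same_node(b, b)
local_contains = any(same_node(c, d) for d in b.iter())

# Simulated remote execution: every argument is copied separately,
# so identity and the ancestor/descendant relationship are lost.
remote_b1 = ship_by_value(b)
remote_b2 = ship_by_value(b)
remote_c = ship_by_value(c)
remote_identity = same_node(remote_b1, remote_b2)
remote_contains = any(same_node(remote_c, d) for d in remote_b1.iter())

print(local_identity, local_contains)    # True True
print(remote_identity, remote_contains)  # False False
```

Under these assumptions, the same comparisons give different answers locally and remotely, which is the kind of semantic difference the conservative strategy avoids and the pass-by-fragment and pass-by-projection semantics aim to eliminate.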

Index Terms:
Distributed databases, XQuery decomposition.
Ying Zhang, Nan Tang, Peter Boncz, "Projective Distribution of XQuery with Updates," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 8, pp. 1059-1076, Aug. 2010, doi:10.1109/TKDE.2010.62