Storing XML (with XSD) in SQL Databases: Interplay of Logical and Physical Designs
December 2005 (vol. 17 no. 12)
pp. 1595-1609
Much business XML data comes with accompanying XSD specifications. In many scenarios, "shredding" such XML data into relational storage is a popular paradigm. Optimizing the evaluation of XPath queries over such data requires careful attention to both the logical and the physical design of the relational database into which the XML data is shredded. None of the existing solutions takes the physical design of the generated relational database into account. In this paper, we study the interplay of logical and physical design and conclude that 1) solving them independently leads to suboptimal performance and 2) there is substantial overlap between logical and physical design: some well-known logical design transformations generate the same mappings as physical design. Furthermore, existing search algorithms are too inefficient to explore the extremely large space of logical and physical design combinations. We propose a search algorithm that carefully avoids searching duplicated mappings and uses workload information to further prune the search space. Experimental results confirm the effectiveness of our approach.
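To make the terms in the abstract concrete, the following is a minimal sketch of what "shredding" can look like. It is not the paper's algorithm or mapping: the sample document, the table and column names, and the helper functions `shred` and `authors_in_year` are all hypothetical, and SQLite is used only because it ships with Python. Inlining single-valued children into the parent table and splitting repeated children into their own table is one possible logical design; the index on `year` is one possible physical design choice.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from the paper.
XML_DOC = """
<dblp>
  <article><title>Paper A</title><year>2004</year>
    <author>Ann</author><author>Bob</author></article>
  <article><title>Paper B</title><year>2005</year>
    <author>Cy</author></article>
</dblp>
"""


def shred(conn, xml_text):
    """Shred <article> elements into two tables (one possible logical design)."""
    cur = conn.cursor()
    # Single-valued children (title, year) are inlined into the parent table;
    # the repeated child (author) gets its own table keyed by article id.
    cur.execute(
        "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)"
    )
    cur.execute("CREATE TABLE author (article_id INTEGER, name TEXT)")
    # One possible physical design choice: an index for predicates on year.
    cur.execute("CREATE INDEX idx_article_year ON article(year)")

    root = ET.fromstring(xml_text)
    for art_id, art in enumerate(root.findall("article"), start=1):
        cur.execute(
            "INSERT INTO article VALUES (?, ?, ?)",
            (art_id, art.findtext("title"), int(art.findtext("year"))),
        )
        for author in art.findall("author"):
            cur.execute("INSERT INTO author VALUES (?, ?)", (art_id, author.text))
    conn.commit()


def authors_in_year(conn, year):
    """SQL counterpart of the XPath query /dblp/article[year=Y]/author."""
    rows = conn.execute(
        "SELECT a.name FROM article AS r "
        "JOIN author AS a ON a.article_id = r.id "
        "WHERE r.year = ? ORDER BY a.name",
        (year,),
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
shred(conn, XML_DOC)
result = authors_in_year(conn, 2004)
```

In this sketch the cost of the translated query depends both on how the elements were split into tables and on which indexes exist, which is the kind of interaction between logical and physical design the abstract refers to.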


Index Terms: XML, physical design, relational databases.
Citation:
Surajit Chaudhuri, Zhiyuan Chen, Kyuseok Shim, Yuqing Wu, "Storing XML (with XSD) in SQL Databases: Interplay of Logical and Physical Designs," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 12, pp. 1595-1609, Dec. 2005, doi:10.1109/TKDE.2005.204