Efficiently Querying Large XML Data Repositories: A Survey
October 2007 (vol. 19 no. 10)
pp. 1381-1403
Extensible Markup Language (XML) is emerging as a de facto standard for information exchange among various applications on the World-Wide Web. There has been a growing need for high-performance techniques to query large XML data repositories efficiently. One important problem in XML query processing is twig pattern matching, that is, finding in an XML data tree D all matches that satisfy a specified twig (or path) query pattern Q. In this survey, we review, classify, and compare major techniques for twig pattern matching. Specifically, we consider two classes of major XML query-processing techniques: the relational approach and the native approach. The relational approach directly utilizes existing relational database systems to store and query XML data, which enables the use of all important techniques that have been developed for relational databases. In the native approach, specialized storage and query-processing systems tailored for XML data are developed from scratch to further improve XML query performance. As implied by existing work, XML data querying and management are developing in the direction of integrating the relational approach with the native approach, which could result in higher query-processing performance and also significantly reduce system-reengineering costs.
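
To make the problem statement concrete, the following is a minimal Python sketch of twig pattern matching by exhaustive search over an in-memory tree. It is not one of the surveyed relational or native techniques; it only illustrates what "all matches of Q in D" means. The names `QNode`, `match_at`, and `twig_matches`, the sample document, and the encoding of edges as "child" or "descendant" are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from itertools import product


@dataclass
class QNode:
    """One node of a twig query (hypothetical helper type)."""
    tag: str
    axis: str = "descendant"  # edge from the parent query node: "child" or "descendant"
    children: list = field(default_factory=list)


def candidates(elem, axis):
    """Elements reachable from elem along the given axis."""
    if axis == "child":
        return list(elem)
    # elem.iter() yields elem itself first, so exclude it
    return [e for e in elem.iter() if e is not elem]


def match_at(q, elem):
    """All matches of the sub-twig rooted at q with q bound to elem.

    Each match is a list of elements, one per query node in pre-order.
    """
    if elem.tag != q.tag:
        return []
    per_child = []
    for c in q.children:
        sub = [m for e in candidates(elem, c.axis) for m in match_at(c, e)]
        if not sub:  # one unsatisfied branch rules out this binding
            return []
        per_child.append(sub)
    # Combine one sub-match per branch; with no branches this yields [[elem]].
    return [[elem] + [x for part in combo for x in part]
            for combo in product(*per_child)]


def twig_matches(q, root):
    """Naive evaluation: try every element of D as the binding of Q's root."""
    return [m for e in root.iter() for m in match_at(q, e)]


# D: a small data tree; Q: the twig book[title]//author
D = ET.fromstring(
    "<lib>"
    "<book><title>A</title><author>X</author><author>Y</author></book>"
    "<book><title>B</title></book>"
    "</lib>"
)
Q = QNode("book", children=[QNode("title", "child"), QNode("author", "descendant")])
matches = twig_matches(Q, D)
for book, title, author in matches:
    print(title.text, author.text)
```

In this sketch the second `book` element contributes no match because it has no `author` descendant. The exhaustive search revisits subtrees many times, which is the kind of cost that motivates the more efficient techniques the survey compares.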

Index Terms:
XML query processing, twig pattern matching
Citation:
Gang Gou, Rada Chirkova, "Efficiently Querying Large XML Data Repositories: A Survey," IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 10, pp. 1381-1403, Oct. 2007, doi:10.1109/TKDE.2007.1060