loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
20th International Conference on Data Engineering (ICDE'04)
Multiresolution Indexing of XML for Frequent Queries
Boston, Massachusetts
March 30-April 02
ISBN: 0-7695-2065-0
Hao He, Duke University, Durham
Jun Yang, Duke University, Durham
XML and other types of semi-structured data are typically represented by a labeled directed graph. To speed up path expression queries over the graph, a variety of structural indexes have been proposed. They usually work by partitioning nodes in the data graph into equivalence classes and storing equivalence classes as index nodes. A(k)-index introduces the concept of local bisimilarity for partitioning, allowing the trade-off between index size and query answering power. However, all index nodes in A(k)-index have the same local similarity k, which cannot take advantage of the fact that a workload may contain path expressions of different lengths, or that different parts of the data graph may have different local similarity requirements.
To overcome these limitations, we propose M(k)- and M*(k)-indexes. The basic M(k)-index is workload-aware: Like the previously proposed D(k)-index, it allows different index nodes to have different local similarity requirements, providing finer partitioning only for parts of the data graph targeted by longer path expressions. Unlike D(k)-index, M(k)-index is never over-refined for irrelevant index or data nodes. However, the workload-aware feature still incurs overrefinement due to over-qualified parent index nodes. Moreover, fine partitions penalize the performance of short path expressions. To solve these problems, we further propose the M*(k)-index. An M*(k)-index consists of a collection of indexes whose nodes are organized in a partition hierarchy, allowing successively coarser partitioning information to co-exist with the finest partitioning information required. Experiments show that our indexes are superior to previously proposed indexes in terms of index size and query performance.
Citation:
Hao He, Jun Yang, "Multiresolution Indexing of XML for Frequent Queries," icde, pp.683, 20th International Conference on Data Engineering (ICDE'04), 2004
Usage of this product signifies your acceptance of the Terms of Use.