2008 19th International Conference on Database and Expert Systems Application
An Entropy-Based Characterization of the Heterogeneity of XML Collections
September 01-September 05
ISBN: 978-0-7695-3299-8
The concept of heterogeneity is very important in XML data management, since many common applications must deal with large and complex collections which do not conform to a schema. Heterogeneity in XML collections can be present at many different levels (textual and structural) and needs to be addressed from several perspectives. This paper contributes a formal characterization of heterogeneity in XML collections based on information-theoretic considerations. We show how it can be applied in some important use cases, and we demonstrate its effectiveness by using it to analyze a number of relevant XML collections and retrieval approaches found in the literature. We show that a large space of highly heterogeneous collections has not been adequately addressed by these approaches.
Index Terms:
XML, collection characterization, heterogeneous data, entropy
Citation:
Ismael Sanz, Marco Mesiti, Giovanna Guerrini, Rafael Berlanga, "An Entropy-Based Characterization of the Heterogeneity of XML Collections," dexa, pp.238-242, 2008 19th International Conference on Database and Expert Systems Application, 2008