2008 19th International Workshop on Database and Expert Systems Applications (2008)
Sept. 1, 2008 to Sept. 5, 2008
ISSN: 1529-4188
ISBN: 978-0-7695-3299-8
pp: 238-242
ABSTRACT
The concept of heterogeneity is very important in XML data management, since many common applications must deal with large and complex collections which do not conform to a schema. Heterogeneity in XML collections can be present at many different levels (textual and structural) and needs to be addressed from several perspectives. This paper contributes a formal characterization of heterogeneity in XML collections based on information-theoretic considerations. We show how it can be applied in some important use cases, and we demonstrate its effectiveness by using it to analyze a number of relevant XML collections and retrieval approaches found in the literature. We show that a large space of highly heterogeneous collections has not been adequately addressed by these approaches.
INDEX TERMS
XML, collection characterization, heterogeneous data, entropy
CITATION
Giovanna Guerrini, Ismael Sanz, Marco Mesiti, Rafael Berlanga, "An Entropy-Based Characterization of the Heterogeneity of XML Collections", 2008 19th International Workshop on Database and Expert Systems Applications, pp. 238-242, 2008, doi:10.1109/DEXA.2008.55