First IEEE International Conference on Data Mining (ICDM'01)
Preparations for Semantics-Based XML Mining
San Jose, California
November 29-December 02
ISBN: 0-7695-1119-8
XML allows users to define elements using arbitrary words and organize them in a nested structure. These features of XML offer both challenges and opportunities in information retrieval, document management, and data mining. In this paper, we propose a new methodology for preparing XML documents for quantitative determination of similarity between XML documents by taking account of XML semantics (i.e., meanings of the elements and nested structures of XML documents). Accurate quantitative determination of similarity between XML documents provides an important basis for a variety of applications of XML document mining and processing. Experiments with XML documents show that our methodology provides a 50-100% improvement in determining similarity over the traditional vector-space model, which considers only term frequency, and 100% accuracy in identifying the category of each document from an on-line bookstore.
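For context, the term-frequency baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of a generic vector-space model over element names, not the authors' methodology; the function names `element_term_frequencies` and `cosine_similarity` and the two sample documents are assumptions made up for this example.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def element_term_frequencies(xml_text):
    """Count how often each element name occurs, ignoring nesting."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


def cosine_similarity(tf_a, tf_b):
    """Cosine of the angle between two term-frequency vectors."""
    shared_terms = tf_a.keys() & tf_b.keys()
    dot = sum(tf_a[term] * tf_b[term] for term in shared_terms)
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents: same structure, one synonymous element name.
doc_a = "<book><title>A</title><author>B</author></book>"
doc_b = "<book><title>C</title><writer>D</writer></book>"

similarity = cosine_similarity(
    element_term_frequencies(doc_a),
    element_term_frequencies(doc_b),
)
print(round(similarity, 3))
```

Because this baseline matches element names literally, `author` and `writer` contribute nothing to the score, and the nesting of elements is discarded entirely. These are the two kinds of semantic information (element meanings and nested structure) that the abstract says the proposed methodology takes into account.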
Citation:
Jung-Won Lee, Kiho Lee, Won Kim, "Preparations for Semantics-Based XML Mining," First IEEE International Conference on Data Mining (ICDM'01), p. 345, 2001