2004 Symposium on Applications and the Internet (SAINT'04)
Change Discovery of Hierarchically Structured, Order-Sensitive Data in HTML/XML Documents
Tokyo, Japan
January 26-January 30
ISBN: 0-7695-2068-5
SeungJin Lim, Utah State University
Yiu-Kai Ng, Brigham Young University
As hierarchically structured, order-sensitive HTML/XML data become more prevalent in on-line data exchange and processing, discovering changes in these data is essential to Web data processing, especially when the data evolve frequently over time. We propose a change-discovery algorithm (CDA) for any two HTML/XML documents, each of which is hierarchically structured and represented as an ordered tree. The novelties of CDA include (i) the use of weighted sequence difference to determine the edit script with the anticipated minimal operational cost and (ii) the generation of the minimal contextual differences between branches of the two given trees. Unlike existing change-detection approaches, which adopt node-to-node comparisons, CDA adopts branch-to-branch comparisons. The edit scripts generated by CDA can be processed in any order to yield the same result, which enhances parallelism. CDA also guarantees a lossless, reversible transformation. The time complexity of CDA is polynomial in the numbers of branches of the two given trees.
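The abstract does not spell out how CDA works, so the following is only an illustrative sketch of the general idea of branch-to-branch comparison, not the authors' algorithm. It assumes that a branch is a root-to-leaf path of labels in an ordered tree, and it computes a minimum-weight insert/delete script between two branch sequences with a standard dynamic program. The names `branches` and `edit_script`, the tuple-based tree encoding, and the uniform weights `w_del` and `w_ins` are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Assumed encoding: a node is (label, [children]); child order is significant.
Tree = Tuple[str, list]
Branch = Tuple[str, ...]


def branches(tree: Tree, prefix: Branch = ()) -> List[Branch]:
    """Return the root-to-leaf label paths in left-to-right order."""
    label, children = tree
    path = prefix + (label,)
    if not children:
        return [path]
    out: List[Branch] = []
    for child in children:
        out.extend(branches(child, path))
    return out


def edit_script(old: List[Branch], new: List[Branch],
                w_del: float = 1.0, w_ins: float = 1.0):
    """Minimum-weight delete/insert script turning `old` into `new`.

    Weights are assumed positive; equal branches are matched at no cost.
    """
    n, m = len(old), len(new)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + w_del
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + w_ins
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if old[i - 1] == new[j - 1]:
                cost[i][j] = cost[i - 1][j - 1]
            else:
                cost[i][j] = min(cost[i - 1][j] + w_del,
                                 cost[i][j - 1] + w_ins)

    # Walk back through the table to recover the operations.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and old[i - 1] == new[j - 1]:
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or cost[i][j] == cost[i - 1][j] + w_del):
            ops.append(("delete", i - 1, old[i - 1]))
            i -= 1
        else:
            ops.append(("insert", j - 1, new[j - 1]))
            j -= 1
    ops.reverse()
    return cost[n][m], ops


# Two small, made-up HTML-like trees.
old_tree = ("html", [("body", [("p", []),
                               ("ul", [("li", []), ("li", [])])])])
new_tree = ("html", [("body", [("h1", []),
                               ("p", []),
                               ("ul", [("li", [])])])])

total, ops = edit_script(branches(old_tree), branches(new_tree))
for op in ops:
    print(op)
print("total cost:", total)
```

Each operation records the index in the sequence it refers to (the old sequence for a delete, the new sequence for an insert), so a consumer does not need to apply the operations in a fixed order. This only loosely mirrors the order-independence the abstract claims for CDA's edit scripts.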
Citation:
SeungJin Lim, Yiu-Kai Ng, "Change Discovery of Hierarchically Structured, Order-Sensitive Data in HTML/XML Documents," saint, pp.178, 2004 Symposium on Applications and the Internet (SAINT'04), 2004