2003 IEEE International Conference on E-Commerce Technology (CEC'03)
Page Digest for Large-Scale Web Services
Newport Beach, California
June 24-June 27
ISBN: 0-7695-1969-5
Daniel Rocco, College of Computing
David Buttler, College of Computing
Ling Liu, College of Computing
We introduce Page Digest, a mechanism for efficient storage and processing of Web documents. The Page Digest design encourages a clean separation of the structural elements of Web documents from their content. Its encoding transformation produces many of the advantages of traditional string digest schemes yet remains invertible without introducing significant additional cost or complexity. Using the Page Digest encoding can provide at least an order of magnitude speedup when traversing a Web document, compared with a standard Document Object Model implementation. Our experiments show that change detection using Page Digest operates in linear time, offering a 75% improvement in execution performance compared with existing systems. In addition, the Page Digest encoding can reduce the tag name redundancy found in Web documents, allowing a 30% to 50% reduction in document size.
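The abstract does not spell out the encoding itself, so the following is only a rough sketch of the general idea it describes: structure kept apart from content, each tag name stored once, and a transformation that can be inverted. The layout (a tag-name table, per-node tag ids and child counts in depth-first order, and a parallel list of text content) and the function names `encode` and `decode` are assumptions made for illustration, not the authors' format. Attributes are ignored, and the input is assumed to be well-formed XML or XHTML.

```python
import xml.etree.ElementTree as ET


def encode(markup):
    """Split a well-formed document into structure arrays and content.

    Hypothetical layout, not the paper's actual format.
    """
    root = ET.fromstring(markup)
    tag_names = []     # each distinct tag name is stored exactly once
    tag_index = {}     # tag name -> position in tag_names
    tag_ids = []       # per node, in depth-first order: index into tag_names
    child_counts = []  # per node, in depth-first order: number of children
    texts = []         # per node, in depth-first order: (text, tail) content

    def visit(node):
        if node.tag not in tag_index:
            tag_index[node.tag] = len(tag_names)
            tag_names.append(node.tag)
        tag_ids.append(tag_index[node.tag])
        child_counts.append(len(node))
        texts.append((node.text, node.tail))
        for child in node:
            visit(child)

    visit(root)
    return {
        "tag_names": tag_names,
        "tag_ids": tag_ids,
        "child_counts": child_counts,
        "texts": texts,
    }


def decode(digest):
    """Rebuild the element tree from the arrays produced by encode()."""
    pos = 0

    def build():
        nonlocal pos
        i = pos
        pos += 1
        node = ET.Element(digest["tag_names"][digest["tag_ids"][i]])
        node.text, node.tail = digest["texts"][i]
        # The child count tells us how many following subtrees belong here.
        for _ in range(digest["child_counts"][i]):
            node.append(build())
        return node

    return build()


if __name__ == "__main__":
    doc = "<html><body><p>one</p><p>two</p><div><p>three</p></div></body></html>"
    digest = encode(doc)
    print(digest["tag_names"])
    print(digest["child_counts"])
    print(ET.tostring(decode(digest), encoding="unicode") == doc)
```

In this sketch the structural arrays are flat lists, so a pass over the document structure is a simple linear scan rather than a pointer-chasing tree walk, and a repeated tag such as `p` costs one small integer per occurrence instead of a repeated string. Whether this matches the performance characteristics reported in the abstract is not something the sketch can demonstrate.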
Citation:
Daniel Rocco, David Buttler, Ling Liu, "Page Digest for Large-Scale Web Services," cec, pp.381, 2003 IEEE International Conference on E-Commerce Technology (CEC'03), 2003