Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 1
Automatic Discovery of Semantic Structures in HTML Documents
Edinburgh, Scotland
August 3-6, 2003
ISBN: 0-7695-1960-1
Saikat Mukherjee, State University of New York at Stony Brook
Guizhen Yang, State University of New York at Stony Brook
Wenfang Tan, State University of New York at Stony Brook
I.V. Ramakrishnan, State University of New York at Stony Brook
Template-driven HTML documents possess an implicit, fixed schema denoting concepts and their relationships in a hierarchical fashion. Discovering this schema remains a relatively unexplored problem. By exploiting the key observation that semantically related items in HTML documents exhibit spatial locality, we develop an algorithm for automatically partitioning these documents into tree-like semantic structures that expose the implicit schema.
Citation:
Saikat Mukherjee, Guizhen Yang, Wenfang Tan, I.V. Ramakrishnan, "Automatic Discovery of Semantic Structures in HTML Documents," icdar, vol. 1, pp. 245, Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 1, 2003