Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 1
Automatic Discovery of Semantic Structures in HTML Documents
Edinburgh, Scotland
August 03-August 06
ISBN: 0-7695-1960-1
Template-driven HTML documents possess an implicit, fixed schema that denotes concepts and their relationships in a hierarchical fashion. Discovering this schema remains a relatively unexplored problem. Exploiting the key observation that semantically related items in HTML documents exhibit spatial locality, we develop an algorithm that automatically partitions such documents into tree-like semantic structures, which expose the implicit schema.
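The abstract does not give the algorithm itself, so the following is only a minimal sketch of the spatial-locality idea, not the authors' method. It assumes that "related" can be approximated by adjacent sibling elements having the same structural shape, and it groups such runs together. The names `Node`, `TreeBuilder`, `signature`, `partition`, and `parse` are hypothetical and introduced purely for illustration.

```python
from html.parser import HTMLParser


class Node:
    """A bare-bones element node: tag name, direct text, child elements."""

    def __init__(self, tag):
        self.tag = tag
        self.text = ""
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a simple element tree.

    Assumption: the input is well nested and contains no void tags
    such as <br> or <img>, which would never be popped from the stack.
    """

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Never pop the synthetic root.
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].text += data.strip()


def parse(html):
    """Parse an HTML string and return the synthetic root node."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def signature(node):
    """Structural shape of a subtree: its tag plus the shapes of its children."""
    return (node.tag, tuple(signature(child) for child in node.children))


def partition(node):
    """Group runs of adjacent children that share the same structural shape.

    This is the illustrative locality heuristic: neighbours with identical
    structure are assumed to belong to the same semantic group.
    """
    groups = []
    for child in node.children:
        sig = signature(child)
        if groups and groups[-1][0] == sig:
            groups[-1][1].append(child)
        else:
            groups.append((sig, [child]))
    return [members for _, members in groups]
```

For example, calling `partition` on the `div` element of `<div><h2>News</h2><p>a</p><p>b</p><h2>Sports</h2><p>c</p></div>` yields four groups with tags `[h2]`, `[p, p]`, `[h2]`, `[p]`. Turning such flat groups into a hierarchy (for instance, attaching each run of paragraphs to the heading that precedes it) would be a further step that this sketch does not attempt.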
Citation:
Saikat Mukherjee, Guizhen Yang, Wenfang Tan, I.V. Ramakrishnan, "Automatic Discovery of Semantic Structures in HTML Documents," icdar, vol. 1, pp. 245, Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 1, 2003