Tree-Structured Template Generation for Web Pages
2004 IEEE/WIC/ACM International Conference on Web Intelligence (WI'04)
Beijing, China
September 20-24, 2004
ISBN: 0-7695-2100-2
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/WI.2004.10101
As the Web becomes an increasingly important source of information, tools for modeling, searching, and extracting information from Web pages are indispensable. When the structure of a Web page, as defined by its markup tags, is modeled explicitly, target information can be extracted using structural templates. This paper introduces the Tree Template Automatic Generator (TTAG), which learns tree-structured templates from training Web pages. TTAG was applied to both query-based and frequently updated Web sites, and produced effective templates from a small number of examples. The experiments show that TTAG is a powerful extraction tool for semi-structured information sources.
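To illustrate the general idea of extraction with a structural template, the following minimal sketch parses a page into a tag tree and matches a hand-written tree template against it. It is not TTAG itself: the abstract does not describe the learning algorithm, so the template here is written by hand, and the names `Node`, `TreeBuilder`, `match`, and `extract`, as well as the tuple-based template format, are assumptions made for this example. The sketch also assumes well-formed markup without void elements such as `<br>`.

```python
from html.parser import HTMLParser


class Node:
    """One element of the tag tree: tag name, direct text, child elements."""

    def __init__(self, tag):
        self.tag = tag
        self.text = ""
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tag tree from markup (assumes every tag is properly closed)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest open element with this tag, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].text += data.strip()


def match(template, node, out):
    """Match a (tag, slot_name_or_None, child_templates) tuple against a node.

    On success, the text of every node bound to a slot is stored in `out`.
    """
    tag, slot, kids = template
    if tag != node.tag or len(kids) != len(node.children):
        return False
    if slot:
        out[slot] = node.text
    return all(match(t, c, out) for t, c in zip(kids, node.children))


def extract(template, html):
    """Return one dict of slot values per subtree that matches the template."""
    builder = TreeBuilder()
    builder.feed(html)
    results = []

    def walk(node):
        out = {}
        if match(template, node, out):
            results.append(out)
        for child in node.children:
            walk(child)

    walk(builder.root)
    return results


# Hypothetical page fragment and template, for illustration only.
page = "<ul><li><b>Foo</b><i>10</i></li><li><b>Bar</b><i>20</i></li></ul>"
item_template = ("li", None, [("b", "title", []), ("i", "price", [])])
records = extract(item_template, page)
```

Here `records` holds one dictionary per `<li>` element, keyed by the slot names `title` and `price`. A learned template, as in the paper, would take the place of the hand-written `item_template`.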
Citation:
Shui-Lung Chuang, Jane Yung-jen Hsu, "Tree-Structured Template Generation for Web Pages," wi, pp. 327-333, 2004 IEEE/WIC/ACM International Conference on Web Intelligence (WI'04), 2004