Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Vol 2
A General Approach for Partitioning Web Page Content Based on Geometric and Style Information
Curitiba, Parana, Brazil
September 23-September 26
ISBN: 0-7695-2822-8
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDAR.2007.10
In this paper, we describe a general-purpose approach for partitioning Web page content. The novelty of our approach lies in the use of detailed layout information from a Web page renderer to determine spatial locality and identify visual separators, and the use of relaxed matching over presentation style information to determine presentation style similarity. We present several examples to illustrate the generality of our approach.
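The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the two ideas it names: spatial locality from rendered geometry, and relaxed (non-exact) matching of presentation style. Everything in it is an assumption made for illustration and is not taken from the paper: the `Block` type, the fraction-of-shared-properties similarity measure, the use of a large vertical gap as a stand-in for a visual separator, and the `max_gap` and `min_similarity` thresholds.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical rendered element: bounding box plus style properties."""
    x: float
    y: float
    width: float
    height: float
    style: dict = field(default_factory=dict)


def style_similarity(a: Block, b: Block) -> float:
    """Relaxed matching: fraction of style properties on which a and b agree."""
    keys = set(a.style) | set(b.style)
    if not keys:
        return 1.0
    same = sum(1 for k in keys if a.style.get(k) == b.style.get(k))
    return same / len(keys)


def vertical_gap(upper: Block, lower: Block) -> float:
    """Whitespace between the bottom of `upper` and the top of `lower`."""
    return lower.y - (upper.y + upper.height)


def partition(blocks, max_gap=20.0, min_similarity=0.5):
    """Group blocks top to bottom; start a new group at a large gap
    (treated here as a visual separator) or at a style change."""
    groups = []
    for block in sorted(blocks, key=lambda b: (b.y, b.x)):
        if groups:
            prev = groups[-1][-1]
            close = vertical_gap(prev, block) <= max_gap
            similar = style_similarity(prev, block) >= min_similarity
            if close and similar:
                groups[-1].append(block)
                continue
        groups.append([block])
    return groups


body = {"font": "Arial", "size": 12, "color": "black"}
link = {"font": "Arial", "size": 12, "color": "blue"}
title = {"font": "Times", "size": 18, "color": "red"}

blocks = [
    Block(0, 0, 100, 20, body),
    Block(0, 25, 100, 20, link),    # near the first block, 2 of 3 properties match
    Block(0, 120, 100, 20, body),   # large gap above: new group
    Block(0, 145, 100, 20, title),  # near, but no shared style values: new group
]
groups = partition(blocks)
group_sizes = [len(g) for g in groups]
```

With these sample blocks, the first two are grouped together even though their colors differ, which is the point of relaxed rather than exact style matching. A real system would take its geometry and style values from a page renderer, as the abstract describes.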
Citation:
H. Guo, J. Mahmud, Y. Borodin, A. Stent, I. Ramakrishnan, "A General Approach for Partitioning Web Page Content Based on Geometric and Style Information," Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 2, pp. 929-933, 2007.