Services Computing, 2004 IEEE International Conference on (SCC'04)
Segmenting the Web Document with Document Object Model
Shanghai, China
September 15-September 18
ISBN: 0-7695-2225-4
Jianli Luo, Yangzhou University, China
Jie Shen, Yangzhou University, China
Cuihua Xie, Yangzhou University, China
We present a model for DOM-based web document segmentation that uses the semi-structured information in web pages. The model builds a DOM tree of the web page by parsing the HTML tags that organize the page's structure. We extend traditional plain-text segmentation algorithms to suit web text segmentation. The boundaries between nodes in the DOM tree are then used to further increase the precision of the segmentation results.
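As a rough illustration of the idea of using HTML structure as segment boundaries, the following minimal sketch splits a page's text wherever a block-level tag opens or closes. It is not the authors' algorithm: it uses Python's streaming `html.parser` rather than a full DOM tree, and the tag set `BLOCK_TAGS` and the names `BlockSegmenter` and `segment_html` are assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumption: tags treated as segment boundaries. The abstract does not
# list which tags the model uses, so this set is purely illustrative.
BLOCK_TAGS = {"p", "div", "td", "tr", "table", "li", "ul", "ol",
              "h1", "h2", "h3", "br"}


class BlockSegmenter(HTMLParser):
    """Collects text runs separated by block-level tag boundaries."""

    def __init__(self):
        super().__init__()
        self.segments = []   # finished text segments, in document order
        self._buffer = []    # text pieces of the segment being built

    def _flush(self):
        # Join the buffered pieces and collapse runs of whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        # Inline tags such as <b> do not flush, so their text stays
        # in the same segment as the surrounding text.
        self._buffer.append(data)


def segment_html(html):
    """Return the list of text segments found in an HTML string."""
    parser = BlockSegmenter()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text after the last boundary
    return parser.segments


if __name__ == "__main__":
    page = "<div><h1>Title</h1><p>First <b>bold</b> para.</p><p>Second.</p></div>"
    for segment in segment_html(page):
        print(segment)
```

In a fuller version, a plain-text segmentation algorithm would run inside each segment, with the structural boundaries acting as additional cut points.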
Citation:
Jianli Luo, Jie Shen, Cuihua Xie, "Segmenting the Web Document with Document Object Model," scc, pp.449-452, Services Computing, 2004 IEEE International Conference on (SCC'04), 2004