2009 WRI Global Congress on Intelligent Systems
Information Extraction Based on Table Area Locating for E-Commerce Websites
Xiamen, China
May 19-May 21
ISBN: 978-0-7695-3571-5
BibTeX:
@article{10.1109/GCIS.2009.310,
  author = {Liubo Ouyang and Rui Dong and Beiji Zou},
  title = {Information Extraction Based on Table Area Locating for E-Commerce Websites},
  journal = {2010 Second WRI Global Congress on Intelligent Systems},
  volume = {4},
  year = {2009},
  isbn = {978-0-7695-3571-5},
  pages = {441-445},
  doi = {http://doi.ieeecomputersociety.org/10.1109/GCIS.2009.310},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/GCIS.2009.310
Efficient extraction of merchandise information is a key technology for e-commerce search engines. By analyzing the characteristics of web tables in the HTML pages of e-commerce websites, this article proposes the notion of table area locating and decomposes merchandise information extraction into three key processes: searching for Preparative Core Areas (PCA), locating the Core Area (CA), and extracting attribute values from the Core Area. It then designs an algorithm for locating the Core Area and an algorithm for extracting attribute names and values. We evaluated the approach on HTML pages from various e-commerce websites. The results indicate that it can locate the merchandise information area and extract attribute names and values efficiently, with better precision and recall.
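The abstract does not give the details of the PCA and CA algorithms, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes well-formed markup with explicit closing tags, treats every table as a candidate area, and picks as the "core" table the one with the most two-cell rows. The names TableCollector, pair_score, and extract_attributes and the scoring heuristic are hypothetical and introduced purely for illustration.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every <table> as a list of rows; each row is a list of cell texts."""

    def __init__(self):
        super().__init__()
        self.tables = []  # finished tables, in the order they close
        self._open = []   # stack of tables being parsed (handles nesting)

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._open.append({"rows": [], "row": None, "cell": None})
        elif self._open:
            table = self._open[-1]
            if tag == "tr":
                table["row"] = []
                table["rows"].append(table["row"])
            elif tag in ("td", "th") and table["row"] is not None:
                table["cell"] = []

    def handle_endtag(self, tag):
        if not self._open:
            return
        table = self._open[-1]
        if tag in ("td", "th") and table["cell"] is not None:
            # Join the text fragments and collapse runs of whitespace.
            table["row"].append(" ".join("".join(table["cell"]).split()))
            table["cell"] = None
        elif tag == "tr":
            table["row"] = None
        elif tag == "table":
            self.tables.append(self._open.pop()["rows"])

    def handle_data(self, data):
        if self._open and self._open[-1]["cell"] is not None:
            self._open[-1]["cell"].append(data)


def is_pair(row):
    """Assumed heuristic: a row with exactly two non-empty cells is a name/value pair."""
    return len(row) == 2 and bool(row[0]) and bool(row[1])


def pair_score(rows):
    """Score a candidate table by how many of its rows look like name/value pairs."""
    return sum(1 for row in rows if is_pair(row))


def extract_attributes(html):
    """Pick the highest-scoring table and return its name/value pairs as a dict."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    if not parser.tables:
        return {}
    core = max(parser.tables, key=pair_score)
    return {row[0].rstrip(":").strip(): row[1] for row in core if is_pair(row)}


SAMPLE = """
<table><tr><td>Home</td><td>Cart</td><td>Help</td></tr></table>
<table>
  <tr><th>Brand:</th><td>Acme</td></tr>
  <tr><th>Price:</th><td>19.99 USD</td></tr>
</table>
"""

if __name__ == "__main__":
    print(extract_attributes(SAMPLE))
```

In this sample the navigation table has no two-cell rows, so the second table is selected and its Brand and Price rows are returned. A real page would need a richer scoring function and tolerance for malformed HTML.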
Index Terms:
Web Tables, DOM tree, Area location, Information extraction
Citation:
Liubo Ouyang, Rui Dong, Beiji Zou, "Information Extraction Based on Table Area Locating for E-Commerce Websites," gcis, vol. 4, pp. 441-445, 2009 WRI Global Congress on Intelligent Systems, 2009.
