Zaragoza, Spain
Aug. 30, 2004 to Sept. 3, 2004
ISBN: 0-7695-2195-9
pp: 282-288
Jerome Robinson, University of Essex, Colchester, U.K.
ABSTRACT
An explanation is given of the basic data structures used in a new page analysis technique for creating wrappers (data extractors) for the result pages that web sites produce in response to user queries submitted via web page forms. The key structure, called a tpGrid, is a representation of the web page that is easier to analyse than the raw HTML code. The analysis looks for repetition patterns of sets of tagSets, which are defined in the paper.
CITATION
Jerome Robinson, "Data Extraction from Web Data Sources", International Workshop on Database and Expert Systems Applications (DEXA), 2004, pp. 282-288, doi:10.1109/DEXA.2004.1333487