Database and Expert Systems Applications, 15th International Workshop on (DEXA'04)
Data Extraction from Web Data Sources
Zaragoza, Spain
August 30-September 03
ISBN: 0-7695-2195-9
Jerome Robinson, University of Essex, Colchester, U.K.
This paper explains the basic data structures used in a new page analysis technique for creating wrappers (data extractors) for the result pages that web sites produce in response to user queries submitted through web page forms. The key structure, called a tpGrid, is a representation of the web page that is easier to analyse than the raw HTML code. The analysis looks for repetition patterns of sets of tagSets, which are defined in the paper.
Citation:
Jerome Robinson, "Data Extraction from Web Data Sources," Database and Expert Systems Applications, 15th International Workshop on (DEXA'04), pp. 282-288, 2004