16th International Conference on Data Engineering (ICDE'00)
XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources
San Diego, California
February 28-March 03
ISBN: 0-7695-0506-6
Ling Liu, Georgia Institute of Technology
Calton Pu, Georgia Institute of Technology
Wei Han, Georgia Institute of Technology
This paper describes the methodology and the software development of XWRAP, an XML-enabled wrapper construction system for semi-automatic generation of wrapper programs. By XML-enabled we mean that the metadata about information content that are implicit in the original web pages will be extracted and encoded explicitly as XML tags in the wrapped documents. In addition, the query-based content filtering process is performed against the XML documents. The XWRAP wrapper generation framework has three distinct features. First, it explicitly separates the tasks of building wrappers that are specific to a Web source from the tasks that are repetitive for any source, and uses a component library to provide basic building blocks for wrapper programs. Second, it provides a user-friendly interface program to allow wrapper developers to generate their wrapper code with a few mouse clicks. Third and most importantly, we introduce and develop a two-phase code generation framework. The first phase utilizes an interactive interface facility to encode the source-specific metadata knowledge identified by individual wrapper developers as declarative information extraction rules. The second phase combines the information extraction rules generated in the first phase with the XWRAP component library to construct an executable wrapper program for the given web source. We report the initial experiments on the performance of the XWRAP code generation system and the wrapper programs generated by XWRAP.
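The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not XWRAP's actual rule language or component library: the rule format (`RULES`), the function names (`extract_records`, `to_xml`, `build_wrapper`), and the sample HTML are all hypothetical and chosen only for illustration. The sketch assumes that extraction rules are plain declarative data (phase one) which are then combined with source-independent building blocks to yield an executable wrapper that emits XML (phase two).

```python
import re
import xml.etree.ElementTree as ET

# Phase 1 output (hypothetical format): declarative extraction rules.
# "record" locates one record region; "fields" maps XML tag names to patterns.
RULES = {
    "record": r"<tr>(.*?)</tr>",
    "fields": {
        "city": r'<td class="city">(.*?)</td>',
        "temp": r'<td class="temp">(.*?)</td>',
    },
}

# Source-independent building blocks (a stand-in for a component library).
def extract_records(html, rules):
    """Apply the declarative rules to raw HTML and return a list of dicts."""
    records = []
    for region in re.findall(rules["record"], html, flags=re.S):
        record = {}
        for tag, pattern in rules["fields"].items():
            match = re.search(pattern, region, flags=re.S)
            if match:
                record[tag] = match.group(1).strip()
        records.append(record)
    return records

def to_xml(records, root_tag="source"):
    """Encode the extracted fields explicitly as XML elements."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, "record")
        for tag, value in record.items():
            ET.SubElement(item, tag).text = value
    return root

# Phase 2: combine the rules with the generic components into a wrapper.
def build_wrapper(rules):
    def wrapper(html):
        return to_xml(extract_records(html, rules))
    return wrapper

if __name__ == "__main__":
    page = (
        '<table><tr><td class="city">Atlanta</td><td class="temp">71</td></tr>'
        '<tr><td class="city">San Diego</td><td class="temp">65</td></tr></table>'
    )
    wrapped = build_wrapper(RULES)(page)
    print(ET.tostring(wrapped, encoding="unicode"))
    # Query-based filtering runs against the XML, not the original HTML.
    for hit in wrapped.findall("record[city='Atlanta']"):
        print(hit.findtext("temp"))
```

Keeping the rules as data means that only `RULES` would change from one web source to another in this sketch, while the extraction and XML-encoding functions stay the same.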
Index Terms:
Web data management, Information extraction, Wrapper generation system, XML
Citation:
Ling Liu, Calton Pu, Wei Han, "XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources," 16th International Conference on Data Engineering (ICDE'00), p. 611, 2000