2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)
GenerIE: Information extraction using database queries
Long Beach, CA, USA
March 01-March 06
ISBN: 978-1-4244-5445-7
Citation:
Luis Tari, Huy Tu Phan, Jorg Hakenberg, Yi Chen, Cao Son Tran, Graciela Gonzalez, Chitta Baral, "GenerIE: Information extraction using database queries," 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), pp. 1121-1124, 2010. IEEE Computer Society, Los Alamitos, CA, USA.
DOI: 10.1109/ICDE.2010.5447773
Information extraction systems are traditionally implemented as a pipeline of special-purpose processing modules. A major drawback of this approach is that whenever a new extraction goal emerges or a module is improved, extraction must be re-applied from scratch to the entire text corpus, even though only a small part of the corpus might be affected. In this demonstration proposal, we describe a novel paradigm for information extraction: we store the parse trees output by text processing in a database and then express extraction needs as queries, which the database can evaluate and optimize. Compared with existing approaches, database queries for information extraction enable generic extraction and minimize reprocessing. However, this approach also poses several technical challenges, including language design, optimization, and automatic query generation. We will present the opportunities and challenges we encountered when building GenerIE, a system that implements this paradigm.
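The paradigm in the abstract (parse once, store the trees, then extract with queries) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from GenerIE: the `node` table schema, the hand-written toy parse trees, the labels `S`/`NP`/`VP`/`V`, and the function names `build_store` and `extract_relations` are all hypothetical, and plain SQL over SQLite stands in for whatever query language and database the system actually uses.

```python
import sqlite3

# Hypothetical toy parses: one list of nodes per sentence.
# Each node is (node_id, parent_id, label, word); inner nodes have word=None.
SAMPLE_PARSES = [
    [(0, None, "S", None), (1, 0, "NP", "RAD53"), (2, 0, "VP", None),
     (3, 2, "V", "regulates"), (4, 2, "NP", "DBF4")],
    [(0, None, "S", None), (1, 0, "NP", "Aspirin"), (2, 0, "VP", None),
     (3, 2, "V", "inhibits"), (4, 2, "NP", "COX1")],
    [(0, None, "S", None), (1, 0, "NP", "The cell"), (2, 0, "VP", None),
     (3, 2, "V", "divides")],  # no object, so no relation can match
]


def build_store(parsed_sentences):
    """Store parse trees once; later extraction goals only add queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE node (
               sent_id INTEGER, node_id INTEGER, parent_id INTEGER,
               label TEXT, word TEXT,
               PRIMARY KEY (sent_id, node_id))"""
    )
    for sent_id, nodes in enumerate(parsed_sentences):
        conn.executemany(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
            [(sent_id, *n) for n in nodes],
        )
    return conn


# Tree pattern S -> NP, VP -> (V, NP) expressed as self-joins on parent links.
RELATION_QUERY = """
SELECT subj.word, verb.word, obj.word
FROM node AS s
JOIN node AS subj ON subj.sent_id = s.sent_id
                 AND subj.parent_id = s.node_id AND subj.label = 'NP'
JOIN node AS vp   ON vp.sent_id = s.sent_id
                 AND vp.parent_id = s.node_id AND vp.label = 'VP'
JOIN node AS verb ON verb.sent_id = vp.sent_id
                 AND verb.parent_id = vp.node_id AND verb.label = 'V'
JOIN node AS obj  ON obj.sent_id = vp.sent_id
                 AND obj.parent_id = vp.node_id AND obj.label = 'NP'
WHERE s.label = 'S' AND verb.word = ?
ORDER BY s.sent_id
"""


def extract_relations(conn, verb):
    """Return (subject, verb, object) triples for sentences using `verb`."""
    return conn.execute(RELATION_QUERY, (verb,)).fetchall()


if __name__ == "__main__":
    store = build_store(SAMPLE_PARSES)
    for v in ("regulates", "inhibits"):
        print(extract_relations(store, v))
```

In this sketch, a new extraction goal (a different verb, or a different tree pattern) only requires a new query against the stored trees; the corpus itself is not parsed again, which is the reprocessing saving the abstract describes.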
