Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
ISBN: 978-1-4244-5445-7
pp: 1121-1124
Huy Tu Phan, Department of Computer Science and Engineering, Arizona State University, Tempe, 85287, USA
Jorg Hakenberg, Department of Computer Science and Engineering, Arizona State University, Tempe, 85287, USA
Yi Chen, Department of Computer Science and Engineering, Arizona State University, Tempe, 85287, USA
Cao Son Tran, Department of Computer Science, New Mexico State University, Las Cruces, 88003, USA
Graciela Gonzalez, Department of Biomedical Informatics, Arizona State University, Phoenix, 85004, USA
Luis Tari, Department of Computer Science and Engineering, Arizona State University, Tempe, 85287, USA
ABSTRACT
Information extraction systems are traditionally implemented as a pipeline of special-purpose processing modules. A major drawback of this approach is that whenever a new extraction goal emerges or a module is improved, extraction has to be re-applied from scratch to the entire text corpus, even though only a small part of the corpus might be affected. In this demonstration proposal, we describe a novel paradigm for information extraction: we store the parse trees produced by text processing in a database, and then express extraction needs as queries, which the database can evaluate and optimize. Compared with existing approaches, database queries for information extraction enable generic extraction and minimize reprocessing. However, this approach also poses several technical challenges, such as language design, optimization, and automatic query generation. We will present the opportunities and challenges that we encountered when building GenerIE, a system that implements this paradigm.
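The paradigm described above can be illustrated with a minimal sketch. It is not GenerIE's actual schema or query language, which the abstract does not specify. It assumes a hypothetical `node` table with one row per parse-tree node, a hand-written toy corpus in place of real parser output, and plain SQL over SQLite as a stand-in for the extraction query. The helper name `extract` and the labels `VERB`, `SUBJ`, and `OBJ` are likewise assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: one row per parse-tree node, with a pointer to its parent.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE node (
        sent_id   INTEGER,
        node_id   INTEGER,
        parent_id INTEGER,   -- NULL for the root of a sentence
        label     TEXT,      -- assumed role labels: 'VERB', 'SUBJ', 'OBJ'
        word      TEXT,
        PRIMARY KEY (sent_id, node_id)
    )
""")

# Toy "parsed" corpus, written by hand (not real parser output).
rows = [
    (1, 1, None, "VERB", "activates"),
    (1, 2, 1, "SUBJ", "GeneA"),
    (1, 3, 1, "OBJ", "GeneB"),
    (2, 1, None, "VERB", "binds"),
    (2, 2, 1, "SUBJ", "ProteinX"),
    (2, 3, 1, "OBJ", "ProteinY"),
]
conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?)", rows)


def extract(verb):
    """Return (subject, object) pairs for sentences whose verb node is `verb`."""
    query = """
        SELECT s.word, o.word
        FROM node AS v
        JOIN node AS s
          ON s.sent_id = v.sent_id AND s.parent_id = v.node_id AND s.label = 'SUBJ'
        JOIN node AS o
          ON o.sent_id = v.sent_id AND o.parent_id = v.node_id AND o.label = 'OBJ'
        WHERE v.label = 'VERB' AND v.word = ?
        ORDER BY v.sent_id
    """
    return conn.execute(query, (verb,)).fetchall()


print(extract("activates"))  # [('GeneA', 'GeneB')]
print(extract("binds"))      # [('ProteinX', 'ProteinY')]
```

In this sketch, a new extraction goal is a new query over the stored trees, and the text does not have to be parsed again. That is the reprocessing saving the abstract refers to.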
CITATION
Huy Tu Phan, Jorg Hakenberg, Yi Chen, Cao Son Tran, Graciela Gonzalez, Luis Tari, "GenerIE: Information extraction using database queries", IEEE International Conference on Data Engineering (ICDE), 2010, pp. 1121-1124, doi:10.1109/ICDE.2010.5447773