2008 IEEE 24th International Conference on Data Engineering
An Algebraic Approach to Rule-Based Information Extraction
Cancun, Mexico
April 07-April 12
ISBN: 978-1-4244-1836-7
Citation:
Frederick Reiss, Sriram Raghavan, Rajasekar Krishnamurthy, Huaiyu Zhu, Shivakumar Vaithyanathan, "An Algebraic Approach to Rule-Based Information Extraction," 2008 IEEE 24th International Conference on Data Engineering, pp. 933-942, 2008. IEEE Computer Society, Los Alamitos, CA, USA.
DOI: http://doi.ieeecomputersociety.org/10.1109/ICDE.2008.4497502
Traditional approaches to rule-based information extraction (IE) have primarily been based on regular expression grammars. However, these grammar-based systems have difficulty scaling to large data sets and large numbers of rules. Inspired by traditional database research, we propose an algebraic approach to rule-based IE that addresses these scalability issues through query optimization. The operators of our algebra are motivated by our experience in building several rule-based extraction programs over diverse data sets. We present the operators of our algebra and propose several optimization strategies motivated by the text-specific characteristics of our operators. Finally, we validate the potential benefits of our approach through extensive experiments over real-world blog data.
