Sergio Flesca, Elio Masciari, Andrea Tagarelli, "A Fuzzy Logic Approach to Wrapping PDF Documents," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 12, pp. 1826-1841, December 2011.
DOI: 10.1109/TKDE.2010.220
ISSN: 1041-4347
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Keywords: Information extraction, fuzzy logic, wrapping, Adobe PDF, print-oriented documents, PDFWrap system.

[1] S. Flesca, S. Garruzzo, E. Masciari, and A. Tagarelli, "Wrapping PDF Documents Exploiting Uncertain Knowledge," Proc. Int'l Conf. Advanced Information Systems Eng. (CAiSE '06), pp. 175-189, 2006.
[2] Adobe Systems Incorporated, "PDF Reference, Fifth ed.: Adobe Portable Document Format version 1.6." http://partners.adobe.com/public/developer pdf, 2004.
[3] R. Baumgartner, S. Flesca, and G. Gottlob, "Visual Web Information Extraction with Lixto," Proc. 27th Int'l Conf. Very Large Data Bases (VLDB '01), pp. 119-128, 2001.
[4] I. Muslea, S. Minton, and C. Knoblock, "Hierarchical Wrapper Induction for Semistructured Information Sources," Autonomous Agents and Multi-Agent Systems, vol. 4, no. 1/2, pp. 93-114, 2001.
[5] C. Hsu and M. Dung, "Wrapping Semistructured Web Pages with Finite-State Transducers," Proc. Conf. Automatic Learning and Discovery, 1998.
[6] N. Kushmerick, "Wrapper Induction: Efficiency and Expressiveness," Artificial Intelligence J., vol. 118, nos. 1/2, pp. 15-68, 2000.
[7] A.H.F. Laender, B.A. Ribeiro-Neto, and A.S. da Silva, "DEByE—Data Extraction by Example," Data and Knowledge Eng., vol. 40, no. 2, pp. 121-154, 2002.
[8] V. Crescenzi and G. Mecca, "Automatic Information Extraction from Large Websites," J. ACM, vol. 51, no. 5, pp. 731-779, 2004.
[9] J. Turmo, A. Ageno, and N. Català, "Adaptive Information Extraction," ACM Computing Surveys, vol. 38, no. 2, pp. 1-47, 2006.
[10] M. Califf and R. Mooney, "Relational Learning of Pattern-Match Rules for Information Extraction," Proc. 16th Nat'l Conf. Artificial Intelligence and the 11th Conf. Innovative Applications of Artificial Intelligence (AAAI/IAAI '99), pp. 328-334, 1999.
[11] D. Freitag, "Machine Learning for Information Extraction in Informal Domains," Machine Learning, vol. 39, nos. 2/3, pp. 169-202, 2000.
[12] S. Soderland, "Learning Information Extraction Rules for Semistructured and Free Text," Machine Learning, vol. 34, nos. 1-3, pp. 233-272, 1999.
[13] A. Laender, B. Ribeiro-Neto, A. da Silva, and J. Teixeira, "A Brief Survey of Web Data Extraction Tools," ACM SIGMOD Record, vol. 31, no. 2, pp. 84-93, 2002.
[14] D. Freitag and N. Kushmerick, "Boosted Wrapper Induction," Proc. 17th Nat'l Conf. Artificial Intelligence and 12th Conf. Innovative Applications of Artificial Intelligence (AAAI/IAAI '00), pp. 577-583, 2000.
[15] L. Liu, C. Pu, and W. Han, "XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources," Proc. 16th Int'l Conf. Data Eng. (ICDE '00), pp. 611-621, 2000.
[16] D. Pinto, A. McCallum, X. Wei, and W.B. Croft, "Table Extraction Using Conditional Random Fields," Proc. 26th Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval (SIGIR '03), pp. 235-242, 2003.
[17] J.-Y. Ramel, M. Crucianu, N. Vincent, and C. Faure, "Detection, Extraction and Representation of Tables," Proc. Int'l Conf. Document Analysis and Recognition (ICDAR), pp. 374-378, 2003.
[18] Y. Liu, P. Mitra, and C.L. Giles, "Identifying Table Boundaries in Digital Documents via Sparse Line Detection," Proc. 17th Conf. Information and Knowledge Management (CIKM '08), pp. 1311-1320, 2008.
[19] T. Hassan and R. Baumgartner, "Table Recognition and Understanding from PDF Files," Proc. Int'l Conf. Document Analysis and Recognition (ICDAR), pp. 1143-1147, 2007.
[20] B. Yildiz, K. Kaiser, and S. Miksch, "pdf2table: A Method to Extract Table Information from PDF Files," Proc. Indian Int'l Conf. Artificial Intelligence (IICAI), pp. 1773-1785, 2005.
[21] Y. Liu, K. Bai, P. Mitra, and C.L. Giles, "Improving the Table Boundary Detection in PDFs by Fixing the Sequence Error of the Sparse Lines," Proc. Int'l Conf. Document Analysis and Recognition (ICDAR), pp. 1006-1010, 2009.
[22] H. Déjean and J.-L. Meunier, "A System for Converting PDF Documents into Structured XML Format," Proc. Int'l Workshop Document Analysis Systems, pp. 129-140, 2006.
[23] M.A. Bhatti and A. Ahmad, "PDF to HTML Conversion: Having a Usable Web Document," Proc. Int'l Conf. Digital Information Management, pp. 289-293, 2006.
[24] F. Yuan, B. Liu, and G. Yu, "A Study on Information Extraction from PDF Files," Proc. Int'l Conf. Advances in Machine Learning and Cybernetics (ICMLC), pp. 258-267, 2005.
[25] T. Hassan and R. Baumgartner, "Intelligent Text Extraction from PDF Documents," Proc. Int'l Conf. Computational Intelligence for Modelling, Control and Automation (CIMCA) and Int'l Conf. Intelligent Agents, Web Technologies and Internet Commerce (IAWTIC), pp. 2-6, 2005.
[26] T. Hassan and R. Baumgartner, "Using Graph Matching Techniques to Wrap Data from PDF Documents," Proc. 15th Int'l Conf. World Wide Web (WWW '06), pp. 901-902, 2006.
[27] T. Hassan, "User-Guided Wrapping of PDF Documents Using Graph Matching Techniques," Proc. Int'l Conf. Document Analysis and Recognition (ICDAR), pp. 631-635, 2009.
[28] L. Zadeh, "Fuzzy Sets," Information and Control, vol. 8, pp. 338-353, 1965.
[29] M. Wygralak, "Fuzzy Cardinals Based on the Generalized Equality of Fuzzy Subsets," Fuzzy Sets and Systems, vol. 18, pp. 143-158, 1986.
[30] Adobe Systems Incorporated, "Document Management—Portable Document Format—Part 1: PDF 1.7." http://www.adobe.com/devnet/acrobat/pdfs PDF32000_2008.pdf, 2011.
[31] S. Skiadopoulos and M. Koubarakis, "Composing Cardinal Direction Relations," Artificial Intelligence, vol. 152, no. 2, pp. 143-171, 2004.
[32] R. Goyal and M. Egenhofer, "Similarity of Cardinal Directions," Proc. 7th Int'l Symp. Advances in Spatial and Temporal Databases, pp. 36-58, 2001.
[33] S. Patwardhan, S. Banerjee, and T. Pedersen, "Using Measures of Semantic Relatedness for Word Sense Disambiguation," Proc. Int'l Conf. Intelligent Text Processing and Computational Linguistics (CICLing '03), pp. 241-257, 2003.
[34] A. Brüggemann-Klein and D. Wood, "One-Unambiguous Regular Languages," Information and Computation, vol. 142, no. 2, pp. 182-206, 1998.
[35] N. Chinchor, "MUC-4 Evaluation Metrics," Proc. Message Understanding Conf. (MUC), pp. 22-29, 1992.

