AML: Efficient Approximate Membership Localization within a Web-Based Join Framework
Feb. 2013 (vol. 25 no. 2)
pp. 298-310
Zhixu Li, The University of Queensland, Brisbane
Laurianne Sitbon, Queensland University of Technology, Brisbane
Liwei Wang, Wuhan University, Wuhan
Xiaofang Zhou, The University of Queensland, Brisbane
Xiaoyong Du, Renmin University of China, Beijing
In this paper, we propose a new type of Dictionary-based Entity Recognition Problem, named Approximate Membership Localization (AML). The popular Approximate Membership Extraction (AME) provides full coverage of the true matched substrings in a given document, but its many redundancies make the AME process inefficient and degrade the performance of real-world applications that use the extracted substrings. The AML problem aims to locate nonoverlapping substrings, which are a better approximation to the true matched substrings, without generating overlapping redundancies. To perform AML efficiently, we propose the optimized algorithm P-Prune, which prunes a large part of the overlapping redundant matched substrings before generating them. Our study using several real-world data sets demonstrates the efficiency of P-Prune over a baseline method. We also study AML as applied to a proposed web-based join framework, a search-based approach that joins two tables using dictionary-based entity recognition from web documents. The results not only show the advantage of AML over AME, but also demonstrate the effectiveness of our search-based approach.
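The contrast between AME and AML described above can be illustrated with a small sketch. It is not the paper's P-Prune algorithm: it enumerates every candidate substring first (the cost P-Prune is designed to avoid) and then keeps a nonoverlapping subset. The token-level Jaccard similarity, the greedy highest-similarity-first selection, and the names `jaccard`, `ame_matches`, and `aml_greedy` are all assumptions made for illustration.

```python
from typing import List, Tuple

# (start, end_exclusive, dictionary_entry, similarity); hypothetical layout
Match = Tuple[int, int, str, float]


def jaccard(a: List[str], b: List[str]) -> float:
    """Token-set Jaccard similarity (an assumed similarity function)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def ame_matches(tokens: List[str], dictionary: List[str],
                threshold: float) -> List[Match]:
    """AME-style extraction: every token window whose similarity to some
    dictionary entry reaches the threshold, overlapping windows included."""
    entries = [(e, e.lower().split()) for e in dictionary]
    # Assumed bound: windows at most one token longer than the longest entry.
    max_len = max(len(toks) for _, toks in entries) + 1
    lowered = [t.lower() for t in tokens]
    matches: List[Match] = []
    for start in range(len(lowered)):
        last_end = min(len(lowered), start + max_len)
        for end in range(start + 1, last_end + 1):
            window = lowered[start:end]
            for entry, entry_tokens in entries:
                sim = jaccard(window, entry_tokens)
                if sim >= threshold:
                    matches.append((start, end, entry, sim))
    return matches


def aml_greedy(matches: List[Match]) -> List[Match]:
    """AML-style localization (greedy stand-in): visit matches from highest
    to lowest similarity and keep each one that overlaps nothing kept so far."""
    chosen: List[Match] = []
    for m in sorted(matches, key=lambda m: (-m[3], m[0], m[1])):
        if all(m[1] <= c[0] or m[0] >= c[1] for c in chosen):
            chosen.append(m)
    return sorted(chosen)


document = "the canon eos 5d digital camera and sony alpha".split()
dictionary = ["canon eos 5d digital camera", "sony alpha"]

all_matches = ame_matches(document, dictionary, threshold=0.6)
localized = aml_greedy(all_matches)

print(len(all_matches), "AME matches (with overlaps)")
for start, end, entry, sim in localized:
    print(" ".join(document[start:end]), "->", entry, round(sim, 2))
```

On this toy input the AME-style pass returns many overlapping windows around each entity mention, while the greedy pass keeps one nonoverlapping span per mention. A practical implementation would prune candidates during generation rather than filter them afterwards.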
Index Terms:
Dictionaries, Redundancy, Approximation methods, Approximation algorithms, Correlation, Web search, Pattern matching, AML, Web-based join, approximate membership location
Citation:
Zhixu Li, Laurianne Sitbon, Liwei Wang, Xiaofang Zhou, Xiaoyong Du, "AML: Efficient Approximate Membership Localization within a Web-Based Join Framework," IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 2, pp. 298-310, Feb. 2013, doi:10.1109/TKDE.2011.178