2009 IEEE International Conference on Data Mining Workshops
Improving Similarity Join Algorithms Using Fuzzy Clustering Technique
Miami, Florida, USA
December 6, 2009
ISBN: 978-0-7695-3902-7
In this paper, we propose a preprocessing technique that uses fuzzy clustering to improve existing string similarity join algorithms. Our approach first identifies groups of related attributes using fuzzy techniques and then applies existing string similarity join algorithms to those attributes. The approach can be applied to the integration of knowledge bases and databases, and it can handle inconsistent values and naming conventions, incorrect or missing data values, and incomplete information from multiple sources with semi-compatible or homogeneous attributes. In an experimental study, our preprocessing approach improved the precision and recall of existing string similarity join algorithms by about 10 percent.
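The two-stage idea in the abstract (group related attributes first, then run a string similarity join only on those attributes) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the fuzzy clustering method, so the sketch substitutes a simple normalized q-gram overlap between columns as a stand-in for fuzzy membership degrees, and a q-gram Jaccard join as the "existing" similarity join. All function names, the sample records, and the 0.5 thresholds are hypothetical choices made for illustration.

```python
from itertools import product

def qgrams(s, q=2):
    """Set of lowercase q-grams of s, padded with '#' at both ends."""
    s = f"#{s.lower().strip()}#"
    return {s[i:i + q] for i in range(len(s) - q + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def column_profile(rows, attr):
    """Union of q-grams over all non-missing values of one attribute."""
    grams = set()
    for row in rows:
        value = row.get(attr)
        if value:
            grams |= qgrams(str(value))
    return grams

def fuzzy_attribute_memberships(left, right):
    """Assumed stand-in for fuzzy clustering: each right attribute gets a
    degree of membership in every left attribute, normalized to sum to 1."""
    left_attrs = sorted({k for row in left for k in row})
    right_attrs = sorted({k for row in right for k in row})
    left_profiles = {a: column_profile(left, a) for a in left_attrs}
    memberships = {}
    for ra in right_attrs:
        profile = column_profile(right, ra)
        sims = {la: jaccard(left_profiles[la], profile) for la in left_attrs}
        total = sum(sims.values())
        memberships[ra] = {la: (s / total if total else 0.0) for la, s in sims.items()}
    return memberships

def select_pairs(memberships, min_degree=0.5):
    """Keep (left_attr, right_attr) pairs whose membership reaches min_degree."""
    return [(la, ra)
            for ra, degrees in memberships.items()
            for la, degree in degrees.items()
            if degree >= min_degree]

def similarity_join(left, right, left_attr, right_attr, threshold=0.5):
    """Row-index pairs whose values on the given attributes have
    q-gram Jaccard similarity of at least threshold."""
    matches = []
    for (i, l), (j, r) in product(enumerate(left), enumerate(right)):
        lv, rv = l.get(left_attr), r.get(right_attr)
        if lv and rv and jaccard(qgrams(str(lv)), qgrams(str(rv))) >= threshold:
            matches.append((i, j))
    return matches

# Hypothetical records from two sources with differently named attributes.
left = [{"name": "Jon Smith", "city": "Miami"},
        {"name": "Maria Garcia", "city": "Detroit"}]
right = [{"full_name": "John Smith", "town": "Miami"},
         {"full_name": "Maria Garcia", "town": "Detriot"}]

memberships = fuzzy_attribute_memberships(left, right)
pairs = select_pairs(memberships)
name_matches = similarity_join(left, right, "name", "full_name")
print(pairs)
print(name_matches)
```

The point of the preprocessing step in this sketch is that the join never compares unrelated columns such as `name` against `town`; only attribute pairs that survive the membership threshold are passed to the join.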
Citation:
Lisa Tan, Farshad Fotouhi, William Grosky, Horia F. Pop, Noureddine Mouaddib, "Improving Similarity Join Algorithms Using Fuzzy Clustering Technique," 2009 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 545-550, 2009