Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06) (2006)
Hong Kong, China
Dec. 18, 2006 to Dec. 22, 2006
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2006.2
Peter Christen, Australian National University
Finding and matching personal names is at the core of an increasing number of applications: from text and Web mining, search engines, to information extraction, deduplication and data linkage systems. Variations and errors in names make exact string matching problematic, and approximate matching techniques have to be applied. When compared to general text, however, personal names have different characteristics that need to be considered. In this paper we discuss the characteristics of personal names and present potential sources of variations and errors. We then overview a comprehensive number of commonly used, as well as some recently developed name matching techniques. Experimental comparisons using four large name data sets indicate that there is no clear best matching technique.
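To illustrate why exact string matching struggles with name variations, the following is a minimal Python sketch of one widely known approximate matching approach, Levenshtein edit distance normalised to a similarity between 0 and 1. The abstract does not name specific techniques, so the choice of edit distance here is an assumption made for illustration, and the function names `edit_distance` and `name_similarity` as well as the sample name pair are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


def name_similarity(a: str, b: str) -> float:
    """Illustrative similarity in [0, 1]; 1.0 means identical after
    lower-casing and trimming (a simplifying assumption)."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    # Exact comparison rejects a plausible spelling variant outright...
    print("gail" == "gayle")                         # False
    # ...while the approximate score still reflects their closeness.
    print(f"{name_similarity('gail', 'gayle'):.2f}")  # 0.60
```

A threshold on such a score would decide whether two names are treated as a match; which threshold and which technique work best depends on the data, in line with the finding that there is no clear best matching technique.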
P. Christen, "A Comparison of Personal Name Matching: Techniques and Practical Issues," Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06), Hong Kong, China, 2006, pp. 290-294.