2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI'06)
Mining the Web for Transliteration Lexicons: Joint-Validation Approach
Hong Kong, China
December 18-December 22
ISBN: 0-7695-2747-7
Jong-Hoon Oh, National Institute of Information and Communications Technology (NICT), Japan
Hitoshi Isahara, National Institute of Information and Communications Technology (NICT), Japan
The Web is the largest available data collection and reflects language use in daily life. With the advent of new technology and the flood of information on the Web, it has become common to coin new terms for new concepts and to render these terms in languages written in non-Latin scripts by "transliteration", that is, "translation by sound". Cross-language natural language processing applications, such as machine translation and cross-language information retrieval, usually need a translation dictionary, which affects the quality of these applications. However, transliterations are usually not registered in the translation dictionary. To address this problem, we present a transliteration lexicon acquisition model that mines the Web for transliteration lexicons. We describe techniques that compare phonetic similarity to recognize transliteration pair candidates on the Web and that find the correct transliteration pairs based on joint-validation. The techniques were evaluated against manually constructed transliteration lexicons, and our experiments showed that they effectively found transliteration lexicons on the Web.
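The abstract does not say how phonetic similarity is computed, so the following is only an illustrative sketch of the candidate-recognition step, not the authors' method. It assumes that each candidate has already been romanized, and it substitutes a plain normalized edit distance for whatever phonetic comparison the paper actually uses. The function names, the threshold value, and the sample strings are all hypothetical. The joint-validation step is not covered.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phonetic_similarity(source: str, romanized: str) -> float:
    """Similarity in [0, 1]; an assumed stand-in for a real phonetic measure."""
    a, b = source.lower(), romanized.lower()
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def candidate_pairs(source_term, romanized_candidates, threshold=0.5):
    """Keep candidates whose similarity to the source term reaches the threshold.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    scored = [(c, phonetic_similarity(source_term, c)) for c in romanized_candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical romanized strings found near the English term "data".
    for cand, score in candidate_pairs("data", ["deita", "keompyuteo"]):
        print(cand, round(score, 2))
```

A similarity filter of this kind only yields candidates. According to the abstract, the correct pairs are then selected by joint-validation, which would be a separate step.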
Citation:
Jong-Hoon Oh, Hitoshi Isahara, "Mining the Web for Transliteration Lexicons: Joint-Validation Approach," 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI'06), pp. 254-261, 2006