July 23, 2008 to July 25, 2008
ISBN: 978-0-7695-3273-8
pp: 139-144
ABSTRACT
Spelling errors often occur in web pages and in user query phrases, and some Uyghur, Kazak, and Kyrgyz language websites use non-Unicode character coding schemes; both seriously degrade the recall and accuracy of a Uyghur, Kazak, and Kyrgyz information retrieval system (UKKIRS). This paper studies these practical problems and proposes effective solutions: for the variety of character codings, a character code conversion method from non-Unicode to Unicode; for spelling errors, a reconstruction and a root-expansion method based on user query phrases. The experimental results indicate that the proposed algorithms solve the problems mentioned above well and are well suited to UKKIRS.
INDEX TERMS
Character coding, Code conversion, Root expansion, Candidate Suggestion
CITATION
Turdi Tohti, Winira Musajan, Askar Hamdulla, "Character Code Conversion and Misspelled Word Processing in Uyghur, Kazak, Kyrgyz Multilingual Information Retrieval System", International Conference on Advanced Language Processing and Web Information Technology (ALPIT), 2008, pp. 139-144, doi:10.1109/ALPIT.2008.95