2009 WRI World Congress on Computer Science and Information Engineering
A Novel Algorithm for Normalizing Noisy Arabic Text
Los Angeles, California USA
March 31-April 02
ISBN: 978-0-7695-3507-4
This paper introduces an algorithm for normalizing noisy text that focuses exclusively on the Arabic language. Although many approaches to Arabic text processing have been proposed, none so far has focused on noisy Arabic text. The paper also introduces a new similarity measure for stemming noisy Arabic documents. Such a measure is needed because the rules commonly applied in stemming cannot be applied to noisy text, which does not conform to the known grammatical rules and contains various spelling mistakes. The proposed normalization algorithm therefore groups words automatically after applying the similarity measure. To validate the algorithm, the new normalization technique is evaluated using the under-stemming error reduction technique introduced by Paice.
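The abstract does not define the similarity measure, so the following is only a minimal sketch of the general idea: group spelling variants by a string similarity score, then score the grouping with an under-stemming index in the style of Paice. The Dice coefficient over character bigrams is a stand-in assumption, not the paper's measure, and the function names, the greedy grouping strategy, and the thresholds are all hypothetical.

```python
from collections import Counter


def bigrams(word):
    """Return the set of adjacent character pairs in a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def dice(a, b):
    """Dice coefficient over character bigrams (assumed stand-in measure)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Single-character or empty words have no bigrams.
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def group_words(words, threshold=0.7):
    """Greedily attach each word to the first cluster whose first member
    is similar enough; otherwise start a new cluster."""
    clusters = []
    for w in words:
        for c in clusters:
            if dice(w, c[0]) >= threshold:
                c.append(w)
                break
        else:
            clusters.append([w])
    return clusters


def cluster_map(clusters):
    """Map each word to the index of the cluster it was placed in."""
    return {w: i for i, c in enumerate(clusters) for w in c}


def understemming_index(concept_groups, cluster_of):
    """Paice-style under-stemming index: the share of desired merges
    (pairs within a hand-built concept group) that were not achieved."""
    gumt = gdmt = 0.0
    for group in concept_groups:
        n = len(group)
        gdmt += 0.5 * n * (n - 1)
        counts = Counter(cluster_of[w] for w in group)
        gumt += 0.5 * sum(u * (n - u) for u in counts.values())
    return gumt / gdmt if gdmt else 0.0


# Illustrative data: two words, each with one noisy spelling variant.
words = ["كتاب", "كتااب", "مدرسة", "مدرسه"]
concepts = [["كتاب", "كتااب"], ["مدرسة", "مدرسه"]]
clusters = group_words(words, threshold=0.7)
ui = understemming_index(concepts, cluster_map(clusters))
```

With the looser threshold of 0.7, both variant pairs are merged and the index is 0. Raising the threshold to 0.8 leaves the second pair split, so half of the desired merges are missed. A lower index means fewer under-stemming errors.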
Index Terms:
Arabic, Stemming, Text processing
Citation:
Eiman Tamah Al-Shammari, "A Novel Algorithm for Normalizing Noisy Arabic Text," csie, vol. 4, pp. 477-482, 2009 WRI World Congress on Computer Science and Information Engineering, 2009