6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007)
Answering English Queries in Automatically Transcribed Arabic Speech
Melbourne, Australia
July 11-13, 2007
ISBN: 0-7695-2841-4
Abdusalam F. A. Nwesri, RMIT University, Australia
S. M. M. Tahaghoghi, RMIT University, Australia
Falk Scholer, RMIT University, Australia
Abstract:
There are several well-known approaches to parsing Arabic text in preparation for indexing and retrieval. Techniques such as stemming and stopping have been shown to improve search results on written newswire dispatches, but few comparisons are available on other data sources. In this paper, we apply several alternative stemming and stopping approaches to Arabic text automatically extracted from the audio soundtrack of news video footage, and compare these with approaches that rely on machine translation of the underlying text. Using the TRECVID video collection and queries, we show that normalisation, stop-word removal, and light stemming increase retrieval precision, but that heavy stemming and trigrams have a negative effect. We also show that the choice of machine translation engine plays a major role in retrieval effectiveness.
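To make the preprocessing steps named in the abstract concrete, the following is a minimal Python sketch of normalisation, stop-word removal, light stemming, and character trigrams. It is not the authors' implementation: the stop-word list, the prefix and suffix lists, the minimum stem length, and all function names are illustrative assumptions chosen for brevity.

```python
import re

# Arabic diacritics (fathatan U+064B through sukun U+0652) and tatweel.
DIACRITICS = re.compile("[\u064B-\u0652]")
TATWEEL = "\u0640"

def normalise(token):
    """Strip diacritics and tatweel, and unify common letter variants."""
    token = DIACRITICS.sub("", token)
    token = token.replace(TATWEEL, "")
    # Alef with hamza above/below or madda -> bare alef (U+0627).
    token = re.sub("[\u0623\u0625\u0622]", "\u0627", token)
    # Alef maksura (U+0649) -> yeh (U+064A).
    return token.replace("\u0649", "\u064A")

# Hypothetical, tiny stop-word list; stored in normalised form so that
# lookups after normalise() still match.
STOPWORDS = {normalise(w) for w in ("في", "من", "على", "الى", "عن")}

# Assumed affix lists in the spirit of light stemmers, longest prefixes first.
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي")

def light_stem(token, min_len=3):
    """Remove at most one prefix and one suffix, keeping >= min_len letters."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= min_len:
            token = token[len(prefix):]
            break
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= min_len:
            token = token[:-len(suffix)]
            break
    return token

def trigrams(token):
    """Overlapping character trigrams; short tokens are returned whole."""
    if len(token) <= 3:
        return [token]
    return [token[i:i + 3] for i in range(len(token) - 2)]

def index_terms(text):
    """Whitespace-tokenise, normalise, drop stop-words, then light-stem."""
    terms = []
    for raw in text.split():
        token = normalise(raw)
        if token in STOPWORDS:
            continue
        terms.append(light_stem(token))
    return terms
```

In this sketch, `index_terms` applies the three steps the abstract reports as helpful, while `trigrams` is kept separate as an alternative term representation. A heavy (root-based) stemmer would need morphological pattern matching and is beyond what this example attempts.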
Index Terms:
Arabic information retrieval, Cross-language information retrieval, Machine translation.
Citation:
Abdusalam F. A. Nwesri, S. M. M. Tahaghoghi, Falk Scholer, "Answering English Queries in Automatically Transcribed Arabic Speech," 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), pp. 11-16, 2007