Web Intelligence and Intelligent Agent Technology, IEEE/WIC/ACM International Conference on (2011)
Aug. 22, 2011 to Aug. 27, 2011
Most traditional information retrieval systems are based on single terms indexing. However, it is admitted that semantic content of a document (or a query) cannot be accurately captured by a simple set of independent keywords. Although, several works have incorporated phrases or other syntactic information in IR, such attempts have shown slight benefit, at best. Particularly in language modeling approaches this is achieved through the use of the big ram or n-gram models. However, in these models all big rams/n-grams are considered and weighted uniformly. In this paper we introduce a new approach to weight and consider only certain types of N-grams "compound terms". Experimental results on three test collections showed an improvement.
information retrieval, language model, compound term indexing
M. Boughanem, A. Hammache and R. Ahmed-Ouamer, "A New Language Model Combining Single and Compound Terms," 2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies(WI-IAT), Lyon, 2011, pp. 67-70.