Third IEEE International Conference on Data Mining (ICDM'03)
Mining the Web to Discover the Meanings of an Ambiguous Word
Melbourne, Florida
November 19-November 22
ISBN: 0-7695-1978-4
Raz Tamir, The Hebrew University of Jerusalem
Reinhard Rapp, Johannes Gutenberg-Universität Mainz
In information retrieval and text mining, information on word senses is usually taken from dictionaries or lexical databases that have been prepared by lexicographers. In this paper we propose an automatic method for word sense induction, i.e. for the discovery of a set of sense descriptors for a given ambiguous word. The approach is based on word co-occurrence statistics derived from web pages. The underlying assumption is that the senses of an ambiguous word are best described by terms that are strongly associated with this word but are mutually exclusive, i.e. whose association strength with each other within the retrieved web pages is as weak as possible. Association strength is measured with a novel Confidence Gain approach that relates the observed co-occurrence frequency of two sense descriptor candidates to an average co-occurrence frequency for pairs of arbitrary words. The proposed approach is fully unsupervised and takes into account the contemporary meanings of words, as reflected in texts from the internet. Our results are evaluated using a list of ambiguous words commonly referred to in the literature.
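The selection idea described in the abstract can be illustrated with a small sketch. The abstract does not give the Confidence Gain formula, so the score below is only an assumed stand-in: the observed number of pages in which two candidates co-occur, divided by the average co-occurrence count over all word pairs in the retrieved pages. The function names, the toy pages for the word "palm", and the tie-breaking by page frequency are all hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(pages):
    """For every unordered word pair, count the pages that contain both words."""
    counts = Counter()
    for page in pages:
        for a, b in combinations(sorted(set(page)), 2):
            counts[(a, b)] += 1
    return counts


def relative_cooccurrence(pair_counts, vocab, a, b):
    """Assumed stand-in score: observed pair count / average count over all pairs."""
    n_pairs = len(vocab) * (len(vocab) - 1) // 2
    average = sum(pair_counts.values()) / n_pairs
    observed = pair_counts[tuple(sorted((a, b)))]
    return observed / average


def pick_sense_descriptors(pages, target, candidates):
    """Return the candidate pair that co-occurs least, relative to the average.

    `candidates` are assumed to be already strongly associated with `target`.
    Ties are broken in favour of candidates that appear on more pages.
    """
    stripped = [[w for w in page if w != target] for page in pages]
    pair_counts = cooccurrence_counts(stripped)
    vocab = {w for page in stripped for w in page}
    page_freq = Counter(w for page in stripped for w in set(page))

    def score(pair):
        a, b = pair
        ratio = relative_cooccurrence(pair_counts, vocab, a, b)
        return (ratio, -(page_freq[a] + page_freq[b]))

    return min(combinations(candidates, 2), key=score)


# Toy "retrieved pages" for the ambiguous word "palm" (invented data).
pages = [
    ["palm", "tree", "beach", "coconut"],
    ["palm", "tree", "coconut", "leaf"],
    ["palm", "hand", "finger", "skin"],
    ["palm", "hand", "finger", "wrist"],
    ["palm", "tree", "beach"],
    ["palm", "hand", "skin"],
]
candidates = ["tree", "hand", "coconut", "finger"]
descriptors = pick_sense_descriptors(pages, "palm", candidates)
```

In this toy data, "tree" and "hand" each appear on half of the pages but never together, so they are returned as the descriptor pair, whereas "tree" and "coconut" co-occur well above the average and would describe the same sense.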
Citation:
Raz Tamir, Reinhard Rapp, "Mining the Web to Discover the Meanings of an Ambiguous Word," icdm, pp.645, Third IEEE International Conference on Data Mining (ICDM'03), 2003