Issue No. 04, July/August 2003 (vol. 15)
<p><b>Abstract</b>—Semantic similarity between words is becoming a generic problem for many applications of computational linguistics and artificial intelligence. This paper explores the determination of semantic similarity by a number of information sources, which consist of structural semantic information from a lexical taxonomy and information content from a corpus. To investigate how information sources could be used effectively, a variety of strategies for using various possible information sources are implemented. A new measure is then proposed which combines information sources nonlinearly. Experimental evaluation against a benchmark set of human similarity ratings demonstrates that the proposed measure significantly outperforms traditional similarity measures.</p>
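The abstract does not state the formula, so the following is only a minimal sketch of what a nonlinear combination of taxonomy-based information sources might look like. It assumes two inputs taken from a lexical taxonomy: the shortest path length between the two words and the depth of their deepest common subsumer. The multiplicative form (exponential decay in path length times a tanh of depth), the function name `combined_similarity`, and the default values of `alpha` and `beta` are illustrative assumptions, not values confirmed by this page.

```python
import math


def combined_similarity(path_length, subsumer_depth, alpha=0.2, beta=0.6):
    """Sketch of a nonlinear combination of two taxonomy-based sources.

    path_length    -- shortest path length between the two words (>= 0)
    subsumer_depth -- depth of their deepest common subsumer (>= 0)
    alpha, beta    -- scaling parameters; the defaults are assumptions
                      chosen for illustration only
    """
    if path_length < 0 or subsumer_depth < 0:
        raise ValueError("path_length and subsumer_depth must be non-negative")

    # Similarity decays exponentially as the words move apart in the taxonomy.
    length_term = math.exp(-alpha * path_length)

    # Similarity grows with the depth of the shared subsumer and saturates,
    # so a more specific common ancestor counts for more, up to a limit.
    depth_term = math.tanh(beta * subsumer_depth)

    # The two sources are multiplied rather than added (nonlinear combination).
    return length_term * depth_term
```

Under these assumptions the score stays in the interval [0, 1), falls as the path grows, and rises as the common subsumer becomes more specific. A corpus-based information content term, which the abstract also mentions, could be multiplied in as a third factor in the same way.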
<p><b>Index Terms</b>: Semantic similarity, lexical database, information content, corpus statistics.</p>
Zuhair A. Bandar, Yuhua Li, "An Approach for Measuring Semantic Similarity between Words Using Multiple Information Sources", IEEE Transactions on Knowledge & Data Engineering, vol.15, no. 4, pp. 871-882, July/August 2003, doi:10.1109/TKDE.2003.1209005