Issue No. 04 - July/August 2003 (vol. 15)
<p><b>Abstract</b>—Semantic similarity between words is becoming a generic problem for many applications of computational linguistics and artificial intelligence. This paper explores the determination of semantic similarity from multiple information sources: structural semantic information from a lexical taxonomy and information content from a corpus. To investigate how these sources can be used effectively, a variety of strategies for combining them are implemented. A new measure is then proposed that combines the information sources nonlinearly. Experimental evaluation against a benchmark set of human similarity ratings demonstrates that the proposed measure significantly outperforms traditional similarity measures.</p>
Index Terms: Semantic similarity, lexical database, information content, corpus statistics.
Yuhua Li, Zuhair A. Bandar, David McLean, "An Approach for Measuring Semantic Similarity between Words Using Multiple Information Sources", IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 4, pp. 871-882, July/August 2003, doi:10.1109/TKDE.2003.1209005