International Conference on Semantic Computing (ICSC 2007)
A Graph Modeling of Semantic Similarity between Words
Irvine, California
September 17-September 19
ISBN: 0-7695-2997-6
Marco A. Alvarez, Utah State University, USA
SeungJin Lim, Utah State University, USA
Measuring the semantic similarity between pairs of words is considered a fundamental operation in data mining and information retrieval. Nevertheless, developing a computational method whose results are close to what humans would perceive remains a difficult task, partly owing to the subjective nature of similarity. This paper presents SSA, a novel algorithm for scoring the semantic similarity between words. Given two input words w_1 and w_2, SSA exploits their corresponding concepts, relationships, and descriptive glosses available in WordNet to build a rooted weighted graph G_sim. The output score is calculated by exploring the concepts present in G_sim and selecting the minimal distance between any two concepts c_1 and c_2 of w_1 and w_2, respectively. The distance is defined as a combination of: 1) the depth of the nearest common ancestor of c_1 and c_2 in G_sim, 2) the intersection of the descriptive glosses of c_1 and c_2, and 3) the shortest distance between c_1 and c_2 in G_sim. SSA achieves a correlation of 0.913 with the human ratings reported by Miller and Charles [15] for a dataset of 28 pairs of nouns. Furthermore, on the full dataset of 65 pairs presented by Rubenstein and Goodenough [20], the correlation between SSA results and the known human ratings is 0.903, which is higher than that of all other algorithms reported for the same dataset. These high correlations with human ratings suggest that SSA would be useful in solving several data mining and information retrieval problems.
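The three ingredients of the distance can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the combination formula or the edge weights, so the formula in `concept_distance`, the unit edge weights, the Jaccard gloss overlap, and the toy taxonomy, glosses, and word senses (`PARENT`, `GLOSS`, `WORD_CONCEPTS`) are all assumptions made for illustration and are not taken from WordNet.

```python
from itertools import product

# Hypothetical toy taxonomy (child -> parent); NOT real WordNet data.
PARENT = {
    "entity": None,
    "artifact": "entity",
    "vehicle": "artifact",
    "car": "vehicle",
    "bicycle": "vehicle",
    "fruit": "entity",
    "banana": "fruit",
}

# Hypothetical glosses for the leaf concepts.
GLOSS = {
    "car": "a motor vehicle with four wheels",
    "bicycle": "a wheeled vehicle with two wheels moved by pedals",
    "banana": "an elongated yellow fruit",
}

# Hypothetical word -> concepts (senses) mapping.
WORD_CONCEPTS = {"car": ["car"], "bike": ["bicycle"], "banana": ["banana"]}


def path_to_root(concept):
    """Return the list of concepts from `concept` up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Number of edges between `concept` and the root."""
    return len(path_to_root(concept)) - 1


def nearest_common_ancestor(c1, c2):
    """Deepest concept that lies on both paths to the root."""
    ancestors_of_c2 = set(path_to_root(c2))
    for candidate in path_to_root(c1):
        if candidate in ancestors_of_c2:
            return candidate
    raise ValueError("concepts do not share a root")


def gloss_overlap(c1, c2):
    """Jaccard overlap of gloss tokens (an assumed overlap measure)."""
    a, b = set(GLOSS[c1].split()), set(GLOSS[c2].split())
    return len(a & b) / len(a | b)


def concept_distance(c1, c2):
    """Assumed combination: a short path, a deep common ancestor, and a
    large gloss overlap all reduce the distance."""
    nca_depth = depth(nearest_common_ancestor(c1, c2))
    # With unit edge weights in a tree, the shortest path goes via the NCA.
    shortest_path = depth(c1) + depth(c2) - 2 * nca_depth
    return shortest_path * (1.0 - gloss_overlap(c1, c2)) / (1.0 + nca_depth)


def word_distance(w1, w2):
    """Minimal distance over all concept pairs of the two words."""
    pairs = product(WORD_CONCEPTS[w1], WORD_CONCEPTS[w2])
    return min(concept_distance(c1, c2) for c1, c2 in pairs)


if __name__ == "__main__":
    print(word_distance("car", "bike"))
    print(word_distance("car", "banana"))
```

In this toy setup, "car" and "bike" share the deep ancestor "vehicle" and several gloss tokens, so their distance comes out smaller than that of "car" and "banana", whose only common ancestor is the root.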
Citation:
Marco A. Alvarez, SeungJin Lim, "A Graph Modeling of Semantic Similarity between Words," icsc, pp.355-362, International Conference on Semantic Computing (ICSC 2007), 2007