I want to use the Wu and Palmer method to compute a similarity measure in WordNet:
wp = (2 × depth(lcs)) / (depth(synset1) + depth(synset2))
where lcs is the "least common subsumer" of synset1 and synset2.
My question is:
According to this paper, the Least Common Subsumer of two concepts A and B is "the most specific concept which is an ancestor of both A and B", where the concept tree is defined by the is-a relation. A concept is an ancestor of another concept in the same sense as in a human family tree: it is that concept's parent, grandparent, and so on. For example:
And the graph:
        Object
          |
       Vehicle
          |
     -----------
     |         |
   Boat    Automobile
               |
              Car
Here, "automobile" is the parent (and therefore an ancestor) of "car", while "vehicle" is an ancestor of "car" and also of "boat". The LCS of "boat" and "car" is therefore "vehicle", since it is the most specific concept that is an ancestor of both. Note that "object" is also a common subsumer of "boat" and "car", but it is not the least one, because its child "vehicle" also subsumes both. "Automobile" is not a common subsumer at all, since it is not an ancestor of "boat".
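To make the definition concrete, here is a minimal Python sketch on the toy hierarchy above. The names `PARENT`, `path_to_root`, `lcs` and `wu_palmer` are made up for this illustration. It also makes two assumptions that the formula itself does not fix: the root has depth 1 (nodes are counted, not edges), and a concept counts as a subsumer of itself, so the LCS of "car" and "automobile" would be "automobile".

```python
# Toy is-a hierarchy from the example: child -> parent (the root maps to None).
PARENT = {
    "object": None,
    "vehicle": "object",
    "boat": "vehicle",
    "automobile": "vehicle",
    "car": "automobile",
}

def path_to_root(concept):
    """Return [concept, parent, grandparent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path

def depth(concept):
    # Assumption: the root has depth 1 (we count nodes on the path, not edges).
    return len(path_to_root(concept))

def lcs(a, b):
    """Most specific concept that subsumes both a and b."""
    subsumers_of_a = set(path_to_root(a))
    # Walk upward from b; the first shared concept is the deepest one.
    for concept in path_to_root(b):
        if concept in subsumers_of_a:
            return concept
    return None

def wu_palmer(a, b):
    return 2 * depth(lcs(a, b)) / (depth(a) + depth(b))

print(lcs("boat", "car"))        # vehicle
print(wu_palmer("boat", "car"))  # 2*2 / (3 + 4) = 4/7
```

With this depth convention, "vehicle" has depth 2, "boat" depth 3 and "car" depth 4, so the score is 4/7. A different depth convention (for example root at depth 0) gives a different number, so check which one your library uses before comparing scores.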
To compute the similarity measure, I suggest using an existing library; otherwise you will need to build the concept graph yourself, which is tedious.
In Perl, you can use the WordNet::Similarity package.
In Python, you can use the nltk package, specifically the wup_similarity method.
In Java, you can use the ws4j package.
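For the Python option, a rough sketch of what the call looks like is below. The wrapper functions `wup` and `lcs_names` are hypothetical helpers written for this answer; the calls inside them (`wn.synset`, `wup_similarity`, `lowest_common_hypernyms`) come from NLTK's WordNet interface. It assumes nltk is installed and the WordNet data has been fetched once with `nltk.download("wordnet")`. The synset names "boat.n.01" and "car.n.01" are just example senses; a word usually has several synsets, so pick the ones you actually mean.

```python
def wup(name_a, name_b):
    """Wu-Palmer similarity between two synsets given by name, e.g. 'car.n.01'."""
    # Imported here so the module loads even where nltk is not installed.
    from nltk.corpus import wordnet as wn
    a, b = wn.synset(name_a), wn.synset(name_b)
    return a.wup_similarity(b)

def lcs_names(name_a, name_b):
    """Names of the lowest common hypernyms (the LCS candidates) of two synsets."""
    from nltk.corpus import wordnet as wn
    a, b = wn.synset(name_a), wn.synset(name_b)
    return [s.name() for s in a.lowest_common_hypernyms(b)]

# Example usage (requires nltk and its WordNet data):
# print(lcs_names("boat.n.01", "car.n.01"))
# print(wup("boat.n.01", "car.n.01"))
```

Note that `lowest_common_hypernyms` returns a list, because WordNet is not a strict tree and two synsets can share more than one equally deep subsumer.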