I am new to NLTK in Python and I am looking for a sample application that can do word sense disambiguation. My searches have turned up plenty of algorithms but no sample application. I just want to pass in a sentence and get the sense of each word by looking it up in the WordNet library. Thanks.
I have found a similar module in Perl: http://marimba.d.umn.edu/allwords/allwords.html Is there such a module in NLTK Python?
Recently, part of the pywsd code has been ported into the bleeding-edge version of NLTK, in the wsd.py module. Try:
>>> from nltk.wsd import lesk
>>> sent = 'I went to the bank to deposit my money'
>>> ambiguous = 'bank'
>>> lesk(sent, ambiguous)
Synset('bank.v.04')
>>> lesk(sent, ambiguous).definition()
u'act as the banker in a game or in gambling'
For better WSD performance, use the pywsd library instead of the NLTK module. In general, simple_lesk() from pywsd does better than lesk from NLTK. I'll try to update the NLTK module as much as possible when I'm free.
In response to Chris Spencer's comment, please note the limitations of Lesk algorithms. I'm simply giving an accurate implementation of the algorithms. It's not a silver bullet; see http://en.wikipedia.org/wiki/Lesk_algorithm
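To make that limitation concrete, here is a minimal sketch of the core Lesk idea: pick the sense whose gloss shares the most words with the sentence. It is not the NLTK or pywsd implementation. The names `TOY_SENSES`, `STOPWORDS`, `tokenize` and `toy_lesk` are made up for illustration, and the glosses are hand-written stand-ins rather than WordNet data.

```python
import re

# Hypothetical, hand-written sense inventory for illustration only.
# A real system would take senses and glosses from WordNet.
TOY_SENSES = {
    "bank": {
        "bank.financial": "an institution that accepts deposits of money and lends money",
        "bank.river": "sloping land beside a river or lake",
    },
}

# A tiny, assumed stopword list; real implementations use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "or", "that", "i", "my", "in", "on"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def toy_lesk(sentence, word, senses=TOY_SENSES):
    """Return (sense, overlap) for the gloss sharing the most words with the sentence.

    Ties go to the first sense listed; an unknown word yields (None, -1).
    """
    context = tokenize(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in senses.get(word, {}).items():
        overlap = len(context & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap


print(toy_lesk("I went to the bank to deposit my money", "bank"))
print(toy_lesk("We sat on the bank of the river", "bank"))
```

Note that in the first sentence only "money" overlaps: "deposit" does not match "deposits" because the sketch does no stemming or lemmatization. When nothing overlaps at all, the first listed sense wins by default. That is the kind of brittleness to expect from any overlap-based approach when glosses are short.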
Also please note that, although:
lesk("My cat likes to eat mice.", "cat", "n")
doesn't give you the right answer, you can use the pywsd implementation of max_similarity():
>>> from pywsd.similarity import max_similarity
>>> max_similarity('my cat likes to eat mice', 'cat', 'wup', pos='n').definition
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
>>> max_similarity('my cat likes to eat mice', 'cat', 'lin', pos='n').definition
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
@Chris, if you want a Python setup.py, just ask politely and I'll write it...