I am trying to determine whether a word is singular or plural using NLTK's pos_tag, but the results are not accurate.
So I need another way to tell whether a word is in singular or plural form. Ideally, I need to do it without using any Python package.
In English, essentially every noun has a root form (lemma) whose default number is singular, so a noun that differs from its lemma can be treated as plural.
Assuming that you have only nouns in your list, you can try this:
from nltk.stem import WordNetLemmatizer

wnl = WordNetLemmatizer()

def isplural(word):
    # Lemmatize as a noun ('n'); a plural form differs from its lemma.
    lemma = wnl.lemmatize(word, 'n')
    # Compare by value with !=, not by identity with "is not".
    plural = word != lemma
    return plural, lemma

nounls = ['geese', 'mice', 'bars', 'foos', 'foo',
          'families', 'family', 'dog', 'dogs']

for nn in nounls:
    isp, lemma = isplural(nn)
    print(nn, lemma, isp)
You will have a problem when a word is not in WordNet: the lemmatizer returns such words unchanged, so they are reported as singular. In that case you have to use a more sophisticated classifier or a finite-state machine outside of NLTK.
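Since the question asks for something that works without any package, here is a rough sketch of a rule-based fallback in plain Python. It is only a heuristic, not a replacement for a real morphological analyzer: the names `IRREGULAR_PLURALS`, `NON_PLURAL_ENDINGS` and `guess_plural` are made up for this example, the irregular table is deliberately tiny, and the suffix rules will get some words wrong (for instance "series" or "buses").

```python
# Hypothetical, hand-written tables; they do not come from NLTK or any library.
IRREGULAR_PLURALS = {
    'geese': 'goose',
    'mice': 'mouse',
    'men': 'man',
    'women': 'woman',
    'children': 'child',
    'feet': 'foot',
    'teeth': 'tooth',
}

# Endings where a final "s" usually does not mark a plural
# ("glass", "bus", "analysis").
NON_PLURAL_ENDINGS = ('ss', 'us', 'is')


def guess_plural(word):
    """Return (is_plural, guessed_singular) using lookup and suffix rules only."""
    w = word.lower()
    # 1. Irregular plurals are handled by direct lookup.
    if w in IRREGULAR_PLURALS:
        return True, IRREGULAR_PLURALS[w]
    # 2. Words that end in "s" but are usually singular.
    if w.endswith(NON_PLURAL_ENDINGS):
        return False, w
    # 3. "families" -> "family"
    if w.endswith('ies') and len(w) > 4:
        return True, w[:-3] + 'y'
    # 4. "boxes" -> "box", "dishes" -> "dish", "classes" -> "class"
    if w.endswith(('ches', 'shes', 'sses', 'xes', 'zes')):
        return True, w[:-2]
    # 5. Default rule: strip a trailing "s" ("dogs" -> "dog").
    if w.endswith('s') and len(w) > 2:
        return True, w[:-1]
    # Anything else is assumed to be singular already.
    return False, w


for nn in ['geese', 'mice', 'bars', 'foos', 'foo',
           'families', 'family', 'dog', 'dogs']:
    isp, singular = guess_plural(nn)
    print(nn, singular, isp)
```

Unlike the WordNet version, this treats an unknown word such as 'foos' as plural because it only looks at the ending. The trade-off is that any singular word ending in "s" that is not covered by the exception list will be misclassified, so extend the tables for your own vocabulary.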