PorterStemmer doesn't seem to work

Aikin · Oct 19, 2012

I am new to Python and practising with examples from a book.
Can anyone explain why nothing changes when I try to stem an example with this code?

>>> from nltk.stem import PorterStemmer
>>> stemmer=PorterStemmer()
>>> stemmer.stem('numpang wifi stop gadget shopping')
'numpang wifi stop gadget shopping'

But when I do this, it works:

>>> stemmer.stem('shopping')
'shop'

Answer

Samuele Mattiuzzo · Oct 19, 2012

Try this:

res = " ".join(stemmer.stem(kw) for kw in 'numpang wifi stop gadget shopping'.split())

The problem is probably that the stemmer works on single words. It treats your whole string as one word and finds no root for it, while the single word "shopping" has the root "shop". So you'll have to stem each word separately.

Edit:

From the NLTK source code:

Stemming algorithms attempt to automatically remove suffixes (and in some
cases prefixes) in order to find the "root word" or stem of a given word. This
is useful in various natural language processing scenarios, such as search.

So I guess you do have to split the string yourself.
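
If you need this in more than one place, the split, stem and rejoin steps can be wrapped in a small helper. This is only a sketch: `stem_phrase` is a hypothetical name (not part of NLTK), it assumes plain whitespace splitting is good enough for your text, and it rejoins the stems with single spaces.

```python
def stem_phrase(phrase, stem=None):
    """Stem each whitespace-separated word of `phrase` and rejoin with spaces.

    `stem` is any callable that maps one word to its stem. If it is omitted,
    NLTK's PorterStemmer is used (assumption: nltk is installed).
    """
    if stem is None:
        # Imported lazily so the helper also works with any other stemmer callable.
        from nltk.stem import PorterStemmer
        stem = PorterStemmer().stem
    # str.split() with no argument splits on any run of whitespace.
    return " ".join(stem(word) for word in phrase.split())
```

Calling stem_phrase('numpang wifi stop gadget shopping') then stems each word on its own, so 'shopping' is handled exactly like the single-word call in the question. Punctuation stays attached to the words with this approach, so real text may need a proper tokenizer first.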