How to remove stop words using NLTK or Python

Alex · Mar 30, 2011 · Viewed 183.4k times

I have a dataset that I would like to remove stop words from, using

stopwords.words('english')

I'm struggling with how to use this in my code to simply take out these words. I already have a list of the words from this dataset; the part I'm struggling with is comparing that list against the stop words and removing them. Any help is appreciated.

Answer

Daren Thomas · Mar 30, 2011
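from nltk.corpus import stopwords
# ...
# Build the stop word set once instead of reloading the list for every word.
stop_words = set(stopwords.words('english'))
filtered_words = [word for word in word_list if word not in stop_words]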
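For a version that can be tried without the NLTK corpus installed, here is a minimal sketch of the same filter wrapped in a helper. The names `remove_stop_words` and `sample_stop_words` are hypothetical, the tiny stop list is only a stand-in for `stopwords.words('english')`, and the case-insensitive comparison is an assumption about what the dataset needs, not part of the original answer.

```python
def remove_stop_words(words, stop_words):
    """Return the words that are not stop words, comparing case-insensitively."""
    # Build the lookup set once; set membership tests are fast on average.
    stop_set = {s.lower() for s in stop_words}
    return [w for w in words if w.lower() not in stop_set]


# Stand-in stop list (an assumption) so the sketch runs without NLTK data.
sample_stop_words = ["the", "is", "a", "on"]
word_list = ["The", "cat", "is", "on", "a", "mat"]

filtered_words = remove_stop_words(word_list, sample_stop_words)
print(filtered_words)  # ['cat', 'mat']
```

With NLTK available, passing `stopwords.words('english')` as the second argument should give the real filter; the corpus has to be fetched once beforehand with `nltk.download('stopwords')`.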