I'm getting the error NameError: name 'stopwords' is not defined, even though I have the package installed. I'm trying to do natural language processing on some feedback reviews. The dataset object is a table with two columns: Review (a sentence of feedback) and the target variable Liked (1 or 0). Help appreciated, thanks!
Block 1
import re
import nltk
nltk.download('stopwords')
Output 1
> [nltk_data] Downloading package stopwords to
> [nltk_data] /Users/user/nltk_data...
> [nltk_data] Package stopwords is already up-to-date!
> Out[14]: True
Block 2
dataset['Review'][0]
review = re.sub('[^a-zA-Z]',' ' ,dataset['Review'][0])
review = review.lower()
review = review.split()
review = [word for word in review if not word in stopwords.words('english')]  # ERROR ON THIS LINE
Output 2
> NameError Traceback (most recent call last)
> <ipython-input-16-8d0ee1fd7c7f> in <module>()
> 3 review = review.lower()
> 4 review = review.split()
> ----> 5 review = [word for word in review if not word in stopwords.words('english')]
> <ipython-input-16-8d0ee1fd7c7f> in <listcomp>(.0)
> 3 review = review.lower()
> 4 review = review.split()
> ----> 5 review = [word for word in review if not word in stopwords.words('english')]
> NameError: name 'stopwords' is not defined
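nltk.download('stopwords') only fetches the corpus data to disk; it does not define the name stopwords in your session. You just have to add the following line before using stopwords in your code: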
from nltk.corpus import stopwords
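Once the import is in place, it may also help to look the stop words up in a set built once, rather than calling stopwords.words('english') for every word in the comprehension. Here is a rough sketch of the cleaning step from Block 2 written that way. The function name clean_review and the tiny demo_stop_words set are made up for illustration (the set is not NLTK's list), so the snippet runs without any downloaded data:

```python
import re


def clean_review(text, stop_words):
    """Keep letters only, lowercase, split, and drop words in stop_words."""
    # Same regex as in the question: replace every non-letter with a space.
    letters_only = re.sub('[^a-zA-Z]', ' ', text)
    words = letters_only.lower().split()
    # Membership tests against a set are fast, so pass a set, not a list.
    return [word for word in words if word not in stop_words]


# Hypothetical stand-in stop words, for illustration only (not NLTK's list).
demo_stop_words = {'the', 'is', 'a', 'this'}

cleaned = clean_review('Wow... Loved this place.', demo_stop_words)
print(cleaned)  # ['wow', 'loved', 'place']
```

With the import from the answer in place, you would presumably pass set(stopwords.words('english')) as the second argument, and dataset['Review'][0] as the first.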