Getting NameError: name 'countVectorizer' is not defined in PyCharm

user10089194 · Oct 3, 2018

I need help with the error NameError: name 'countVectorizer' is not defined, which I get when running my script in PyCharm.

I am trying to run the feature extraction code from this source: https://github.com/chdoig/pytexas2015-ml

File Name: 1-Feature_extraction.ipynb

import numpy as np
import pandas as pd


train_data = pd.read_csv('labeledTrainData.tsv',sep='\t')
print(train_data)
print(train_data.iloc[1].review)

test_data = pd.read_csv('testData.tsv',sep = '\t')
print(test_data)

import matplotlib.pyplot as plt
import seaborn as sns

train_data['review_len'] = train_data.review.apply(len)
len_pl = plt.hist(train_data.review_len.values)
plt.show()

# describe negative (sentiment == 0) and positive (sentiment == 1) reviews
print(train_data[train_data.sentiment==0].describe())
print(train_data[train_data.sentiment==1].describe())

# inspecting outliers
print(train_data[train_data.review_len==52].review.all())
print(train_data[train_data.review_len==13708].review.all())

# word extraction

from sklearn.feature_extraction.text import CountVectorizer

vocab = ['awesome', 'terrible']
simple_vectorizer = countVectorizer(vocabulary=vocab)
bow = simple_vectorizer.fit_transform(train_data.review).todense()
print(bow)

Error/Warning:

C:\Users\hi\PycharmProjects\Practice2\venv\Scripts\python.exe C:/Users/hi/PycharmProjects/Practice2/P1.py
C:\Users\hi\PycharmProjects\Practice2\venv\lib\site-packages\sklearn\externals\joblib\externals\cloudpickle\cloudpickle.py:47: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
Traceback (most recent call last):
  File "C:/Users/hi/PycharmProjects/Practice2/P1.py", line 32, in <module>
    simple_vectorizer = countVectorizer(vocabulary=vocab)
NameError: name 'countVectorizer' is not defined

Process finished with exit code 1

Answer

Sean Pianka · Oct 3, 2018

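You are importing CountVectorizer but referencing countVectorizer. Python names are case-sensitive, so the lowercase spelling is a different name that was never defined. Change the call to CountVectorizer(vocabulary=vocab) and the NameError goes away.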
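A minimal sketch of the fix, assuming scikit-learn is installed. The three short reviews are made up for illustration and stand in for train_data.review from the question, and toarray() is used here instead of the question's todense() just to get a plain NumPy array.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-in for train_data.review (the real data comes from
# labeledTrainData.tsv in the question).
reviews = [
    "An awesome film, truly awesome.",
    "Terrible plot and terrible acting.",
    "Awesome cast but a terrible script.",
]

# The lowercase spelling was never imported, so looking it up fails.
try:
    countVectorizer
except NameError as exc:
    message = str(exc)
print(message)

# Use the class name exactly as it was imported.
vocab = ['awesome', 'terrible']
simple_vectorizer = CountVectorizer(vocabulary=vocab)

# One row per review, one column per vocabulary word, in the order of vocab.
bow = simple_vectorizer.fit_transform(reviews).toarray()
print(bow)
```

The deprecation warning about the imp module in the question's output comes from scikit-learn's bundled joblib and is unrelated to the NameError.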