nltk.word_tokenize() giving AttributeError: 'module' object has no attribute 'defaultdict'

Kantajit · Jul 8, 2015 · Viewed 12.7k times

I am new to NLTK and was trying out some basics.

import nltk
nltk.word_tokenize("Tokenize me")

This gives me the following error:

Traceback (most recent call last):
  File "<pyshell#27>", line 1, in <module>
    nltk.word_tokenize("hi im no onee")
  File "C:\Python27\lib\site-packages\nltk\tokenize\__init__.py", line 101, in word_tokenize
    return [token for sent in sent_tokenize(text, language)
  File "C:\Python27\lib\site-packages\nltk\tokenize\__init__.py", line 85, in sent_tokenize
    tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
  File "C:\Python27\lib\site-packages\nltk\data.py", line 786, in load
    resource_val = pickle.load(opened_resource)
AttributeError: 'module' object has no attribute 'defaultdict'

Can someone please tell me how to fix this error?

Answer

Manoj · Aug 14, 2015

I just checked it on my system.

Fix:

>>> import nltk
>>> nltk.download('all')

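After that, everything worked fine. Note that 'all' downloads every NLTK data package. The traceback fails while loading tokenizers/punkt, so nltk.download('punkt') may be enough.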

>>> import nltk
>>> nltk.word_tokenize("Tokenize me")
['Tokenize', 'me']
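
If the error still shows up after re-downloading the data, one thing that may be worth checking is which `collections` module Python is actually importing. This is an assumption, not something the question confirms: `pickle.load` looks up `collections.defaultdict` by name while loading the punkt pickle, so a local file called `collections.py` that shadows the standard library would produce this kind of AttributeError. The helper name `check_defaultdict_pickling` below is made up for illustration; the sketch uses only the standard library.

```python
import collections
import pickle


def check_defaultdict_pickling():
    """Return (module_path, ok) for the collections module pickle will use."""
    # Where `import collections` actually resolved to. A path inside your own
    # project folder (rather than the Python install) would suggest shadowing.
    module_path = getattr(collections, "__file__", "<built-in>")

    ok = False
    # pickle resolves collections.defaultdict by attribute lookup when loading.
    if hasattr(collections, "defaultdict"):
        original = collections.defaultdict(int, {"a": 1})
        restored = pickle.loads(pickle.dumps(original))
        ok = restored == original
    return module_path, ok


module_path, ok = check_defaultdict_pickling()
print("collections loaded from:", module_path)
print("defaultdict round-trips through pickle:", ok)
```

If the printed path points at a file in your working directory, renaming that file (and deleting its compiled .pyc) would be the thing to try before reinstalling anything.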