Gensim word2vec in python3 missing vocab

Sam Lee · Feb 28, 2017

I'm using the gensim implementation of Word2Vec. I have the following code snippet:

print('training model')
model = Word2Vec(Sentences(start, end))
print('trained model:', model)
print('vocab:', model.vocab.keys())

When I run this in Python 2, it works as expected: the final print shows all the words in the vocabulary.

However, if I run it in Python 3, I get an error:

trained model: Word2Vec(vocab=102, size=100, alpha=0.025)
Traceback (most recent call last):
  File "learn.py", line 58, in <module>
    train(to_datetime('-4h'), to_datetime('now'), 'model.out')
  File "learn.py", line 23, in train
    print('vocab:', model.vocab.keys())
AttributeError: 'Word2Vec' object has no attribute 'vocab'

What is going on? Is gensim word2vec not compatible with python3?

Answer

gojomo · Mar 1, 2017

Are you using the same version of gensim in both places? Gensim 1.0.0 moves vocab to a helper object, so whereas in pre-1.0.0 versions of gensim (in Python 2 or 3), you can use:

model.vocab

...in gensim 1.0.0+ you should instead use (in Python 2 or 3)...

model.wv.vocab
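
If the same script has to run against both layouts, one option is a small helper that looks for the vocabulary wherever the installed release keeps it. This is only a sketch: vocab_words is a hypothetical name, not part of gensim, and the key_to_index lookup is an assumption about gensim 4.x, where the vocab mapping on the helper object appears to have been renamed. The stand-in objects at the bottom are not real gensim models; they only mimic the two attribute layouts so the helper can be exercised without training anything.

```python
from types import SimpleNamespace


def vocab_words(model):
    """Return the vocabulary words of a Word2Vec-like model as a list.

    Hypothetical helper, not a gensim API.
    """
    # gensim 1.0.0+ keeps the vocabulary on the `wv` helper object;
    # if there is no `wv`, fall back to looking on the model itself.
    holder = getattr(model, "wv", model)
    # Assumption: gensim 4.x exposes the mapping as `key_to_index`
    # rather than `vocab`, so try that name first.
    for attr in ("key_to_index", "vocab"):
        mapping = getattr(holder, attr, None)
        if mapping is not None:
            return list(mapping.keys())
    raise AttributeError("no vocabulary mapping found on model")


# Stand-ins for the pre-1.0.0 and 1.0.0+ layouts (not real gensim objects).
old_style = SimpleNamespace(vocab={"cat": 0, "dog": 1})
new_style = SimpleNamespace(wv=SimpleNamespace(vocab={"cat": 0, "dog": 1}))

print("old layout:", vocab_words(old_style))
print("new layout:", vocab_words(new_style))
```

With a real model you would call vocab_words(model) in place of model.vocab.keys(). To confirm whether the two interpreters really have different releases installed, printing gensim.__version__ in each one is a quick check.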