gensim word2vec: Find number of words in vocabulary

hlin117 · Feb 24, 2016

After training a word2vec model using python gensim, how do you find the number of words in the model's vocabulary?

Answer

gojomo · Feb 26, 2016

The vocabulary is stored as a dictionary in the vocab field of the Word2Vec model's wv property, keyed by token (word). So getting its size is just the usual Python for a dictionary's length:

len(w2v_model.wv.vocab)

(In gensim versions before 0.13, vocab appeared directly on the model, so you would use w2v_model.vocab instead of w2v_model.wv.vocab.)
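
Since the attribute location depends on the gensim version, a small version-tolerant helper may be convenient. The sketch below is only an illustration: vocab_size is a hypothetical name, not part of gensim. It also assumes that gensim 4.0 and later expose the word-to-index mapping as wv.key_to_index in place of wv.vocab, which is worth checking against the release you have installed. The demo uses stand-in objects rather than real trained models, so it runs without gensim.

```python
from types import SimpleNamespace


def vocab_size(model):
    """Return the number of words in a Word2Vec-style model.

    Hypothetical helper (not a gensim API): tries the attribute
    layouts of different gensim generations, newest first.
    """
    wv = getattr(model, "wv", None)
    if wv is not None:
        # Assumed gensim 4.x layout: dict mapping word -> index
        if hasattr(wv, "key_to_index"):
            return len(wv.key_to_index)
        # Layout described in the answer: dict on the wv property
        if hasattr(wv, "vocab"):
            return len(wv.vocab)
    # Oldest layout: vocab dict directly on the model
    return len(model.vocab)


# Stand-in objects that mimic the three layouts (not real gensim models)
new_style = SimpleNamespace(wv=SimpleNamespace(key_to_index={"cat": 0, "dog": 1}))
mid_style = SimpleNamespace(wv=SimpleNamespace(vocab={"cat": None, "dog": None, "fish": None}))
old_style = SimpleNamespace(vocab={"cat": None})

print(vocab_size(new_style), vocab_size(mid_style), vocab_size(old_style))
```

With a real trained model you would call vocab_size(w2v_model) in the same way.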