After training a word2vec model using python gensim, how do you find the number of words in the model's vocabulary?
The vocabulary is in the vocab field of the Word2Vec model's wv property, as a dictionary, with the keys being each token (word). So it's just the usual Python for getting a dictionary's length:
len(w2v_model.wv.vocab)
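(In older gensim versions before 0.13, vocab appeared directly on the model, so you would use w2v_model.vocab instead of w2v_model.wv.vocab. In gensim 4.0 and later, the vocab dictionary was removed; use len(w2v_model.wv.key_to_index) or simply len(w2v_model.wv) instead.)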
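Since the attribute holding the vocabulary depends on the gensim version, one option is a small helper that tries each location in turn. The following is only a sketch: the function name vocab_size is made up for illustration, it assumes that gensim 4.0+ exposes wv.key_to_index while earlier releases expose wv.vocab (or vocab directly on the model), and the demo objects are plain stand-ins rather than real trained models.

```python
from types import SimpleNamespace


def vocab_size(model):
    """Return the number of words in a word2vec-style model's vocabulary.

    Hypothetical helper: checks the attribute layouts described above,
    newest first.
    """
    wv = getattr(model, "wv", None)
    if wv is not None:
        # Assumed gensim 4.0+ layout: wv.key_to_index maps word -> index.
        key_to_index = getattr(wv, "key_to_index", None)
        if key_to_index is not None:
            return len(key_to_index)
        # Layout from the answer above: wv.vocab is a dict keyed by word.
        return len(wv.vocab)
    # Oldest layout: vocab sits directly on the model.
    return len(model.vocab)


# Stand-in objects that only mimic the attribute layouts (not real models).
new_style = SimpleNamespace(wv=SimpleNamespace(key_to_index={"cat": 0, "dog": 1}))
mid_style = SimpleNamespace(wv=SimpleNamespace(vocab={"cat": None, "dog": None, "fish": None}))
old_style = SimpleNamespace(vocab={"cat": None})

print(vocab_size(new_style), vocab_size(mid_style), vocab_size(old_style))
```

With a real trained model you would call vocab_size(w2v_model) in place of the stand-ins.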