How to get vocabulary word count from gensim word2vec?

Michelle Owen · May 12, 2016 · Viewed 23k times

I am using the gensim word2vec package in Python. I know how to get the vocabulary from the trained model, but how do I get the word count for each word in the vocabulary?

Answer

user3390629 · Jun 23, 2016

Each word in the vocabulary has an associated vocabulary object, which contains an index and a count.

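# w2v is the trained model. Note: gensim 4.0+ removed .vocab;
# there, use w2v.wv.get_vecattr("word", "count") instead.
vocab_obj = w2v.vocab["word"]
vocab_obj.count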

Output for the Google News w2v model: 2998437

So to get the count for each word, you would iterate over all words and vocab objects in the vocabulary.

for word, vocab_obj in w2v.vocab.items():
    # Do something with vocab_obj.count, for example print it
    print(word, vocab_obj.count)
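
If you want all the counts in one dictionary, the loop above could be wrapped in a small helper. This is only a sketch: `word_counts` is a name I made up, and it assumes the object you pass in either has the older `vocab` dict (as in the answer above) or the gensim 4.x layout with `key_to_index` and `get_vecattr`. The `FakeKeyedVectors` class is a hypothetical stand-in so the snippet runs without a trained model; with a real model you would pass `model.wv` (or the model itself on very old versions).

```python
from types import SimpleNamespace


def word_counts(kv):
    """Return {word: count} for a gensim KeyedVectors-like object.

    Assumption: gensim < 4.0 exposes kv.vocab (word -> object with .count),
    while gensim >= 4.0 exposes kv.key_to_index and kv.get_vecattr(word, "count").
    """
    if hasattr(kv, "key_to_index"):  # gensim 4.x layout
        return {word: kv.get_vecattr(word, "count") for word in kv.key_to_index}
    # Older layout, as used in the answer above
    return {word: obj.count for word, obj in kv.vocab.items()}


class FakeKeyedVectors:
    """Hypothetical stand-in that mimics the old .vocab layout."""

    def __init__(self):
        self.vocab = {
            "the": SimpleNamespace(index=0, count=12),
            "cat": SimpleNamespace(index=1, count=3),
        }


counts = word_counts(FakeKeyedVectors())

# Sort by frequency, most common first
top = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(top)  # [('the', 12), ('cat', 3)]
```

Keep in mind the counts are only as meaningful as what the model stored; for vectors loaded from a file without a vocabulary file, they may not be true corpus frequencies.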