How to get vocabulary word count from gensim word2vec?

gensim word2vec

Michelle Owen · May 12, 2016 · Viewed 23k times · Source

I am using the gensim word2vec package in Python. I know how to get the vocabulary from the trained model, but how do I get the word count for each word in the vocabulary?

Answer

Each word in the vocabulary has an associated vocabulary object, which contains an index and a count.

vocab_obj = w2v.vocab["word"]  # vocabulary entry for the token "word"
vocab_obj.count                # the count stored for that token

Output for the Google News w2v model: 2998437

So to get the count for each word, you would iterate over all words and vocab objects in the vocabulary.

for word, vocab_obj in w2v.vocab.items():
    # Do something with vocab_obj.count, e.g. print it next to the word
    print(word, vocab_obj.count)
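The `.vocab` attribute used above matches the gensim releases available when this answer was written. As far as I know, gensim 4.0 and later removed it in favour of `key_to_index` and `get_vecattr` on the model's `wv` object. The helper below is a minimal sketch that tries both layouts; `word_counts` is a hypothetical name of my own, not part of gensim, and it assumes `model` is either a trained `Word2Vec` model or a `KeyedVectors` instance.

```python
def word_counts(model):
    """Return a {word: count} dict for a trained word2vec-style model.

    Hypothetical helper, not part of gensim itself.
    """
    # Use model.wv when it exists (a full Word2Vec model); otherwise
    # assume the object itself holds the vocabulary (KeyedVectors).
    kv = getattr(model, "wv", model)

    if hasattr(kv, "get_vecattr"):
        # Assumed gensim >= 4.0 layout: counts are a per-key attribute.
        return {word: kv.get_vecattr(word, "count") for word in kv.key_to_index}

    # Older layout, as in the answer above: word -> vocab object with .count
    return {word: vocab_obj.count for word, vocab_obj in kv.vocab.items()}
```

With the counts in a plain dict, something like `sorted(word_counts(w2v).items(), key=lambda item: item[1], reverse=True)[:10]` would list the ten highest-count words.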
