How to get word2index from gensim

GabrielChu · Nov 5, 2017 · Viewed 9.8k times

According to the docs, a word2vec model can be loaded with gensim like this:

from gensim.models import KeyedVectors
model = KeyedVectors.load_word2vec_format('word2vec.50d.txt', binary=False)

The model exposes an index-to-word mapping, e.g. model.index2word[2]. How can I derive the inverse (word-to-index) mapping from it?

Answer

gojomo · Nov 5, 2017

The word-to-index mappings are in the KeyedVectors vocab property, a dictionary whose values are objects that include an index property.

For example:

word = "whatever"  # for any word in model
i = model.vocab[word].index
model.index2word[i] == word  # will be true
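
If a plain word-to-index dictionary is more convenient than going through vocab for each lookup, one option is to invert the index2word list directly. The following is a minimal sketch, not part of the original answer: it uses a small made-up list as a stand-in for model.index2word so that it runs without a model file, and the name word2index is only illustrative.

```python
# Hypothetical stand-in for model.index2word; with a loaded model you
# would use model.index2word here instead.
index2word = ["the", "of", "and", "whatever"]

# Invert the list: each word maps to its position in index2word.
word2index = {word: i for i, word in enumerate(index2word)}

i = word2index["whatever"]
print(i, index2word[i])  # the index, and the same word recovered from it
```

Note that in gensim 4.0 and later the vocab property was replaced by key_to_index (and index2word by index_to_key), so on those versions model.key_to_index[word] returns the index directly.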