According to the docs, we can read a word2vec model with gensim like this:
from gensim.models import KeyedVectors
model = KeyedVectors.load_word2vec_format('word2vec.50d.txt', binary=False)
The model provides an index-to-word mapping, e.g. model.index2word[2]. How can I derive the inverted mapping (word-to-index) from it?
The word-to-index mappings are in the KeyedVectors vocab property, a dictionary whose values are objects that include an index property.
For example:
word = "whatever" # for any word in model
i = model.vocab[word].index
model.index2word[i] == word # will be true
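If you want the whole inverted mapping as a plain dict, it can also be built directly from the index-to-word list. Below is a minimal sketch, not gensim's own API: build_word_to_index is a hypothetical helper name, and the short word list is a made-up stand-in for model.index2word so the snippet runs without a model file.

```python
def build_word_to_index(index2word):
    """Return a dict mapping each word to its position in index2word."""
    return {word: i for i, word in enumerate(index2word)}


# Hypothetical stand-in for model.index2word (assumed sample data,
# used only so this sketch runs without loading a model).
index2word = ["the", "of", "and", "to"]

word2index = build_word_to_index(index2word)
print(word2index["and"])  # 2
```

With a loaded model you would pass model.index2word instead of the sample list. As far as I know, gensim 4.x renamed index2word to index_to_key and exposes the word-to-index dict directly as key_to_index, so on newer versions the manual inversion should not be needed.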