I am trying to find the most important words in a corpus based on their TF-IDF scores.
I have been following the example at https://radimrehurek.com/gensim/tut2.html. Based on
>>> for doc in corpus_tfidf:
...     print(doc)
the TF-IDF score is getting updated in each iteration. For example,
So here's how I am currently getting the final TF-IDF score for each word:
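import gensim

# Assumes `corpus` (bag-of-words vectors) and `dictionary` were built as in the tutorial.
tfidf = gensim.models.tfidfmodel.TfidfModel(corpus)
corpus_tfidf = tfidf[corpus]
d = {}
for doc in corpus_tfidf:
    for id, value in doc:
        word = dictionary.get(id)
        d[word] = value  # a later document overwrites an earlier score for the same word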
Is there a better way?
Thanks in advance.
How about using a dictionary comprehension?
d = {dictionary.get(id): value for doc in corpus_tfidf for id, value in doc}
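One caveat with both the loop and the comprehension: a word that occurs in several documents has a separate score in each, and the dict ends up holding only the last one seen. If the goal is to rank words across the whole corpus, you probably want to aggregate instead. Here is a rough sketch that keeps the highest score per word. Taking the maximum is my assumption about what "most important" means (a mean or sum would be equally defensible), and `max_tfidf_per_word` plus the sample data are made up for illustration, using plain lists and a dict as stand-ins for the gensim objects.

```python
# Hypothetical stand-ins for the gensim objects in the question:
# each document is a list of (token_id, tfidf_score) pairs.
corpus_tfidf = [
    [(0, 0.6), (1, 0.8)],
    [(0, 0.3), (2, 0.95)],
]
dictionary = {0: "human", 1: "interface", 2: "computer"}


def max_tfidf_per_word(corpus_tfidf, dictionary):
    """Keep the highest TF-IDF score seen for each word across all documents."""
    best = {}
    for doc in corpus_tfidf:
        for token_id, score in doc:
            word = dictionary.get(token_id)
            # Only replace the stored score when this document's score is higher.
            if score > best.get(word, 0.0):
                best[word] = score
    return best


scores = max_tfidf_per_word(corpus_tfidf, dictionary)

# Words ordered from highest to lowest aggregated score.
top_words = sorted(scores, key=scores.get, reverse=True)
print(top_words)
```

With the sample data, "human" keeps its score from the first document (0.6) rather than being overwritten by the lower score from the second.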