I've got a trained LDA model and I want to calculate the similarity score between two documents from the corpus I trained my model on. After studying all the Gensim tutorials and functions, I still can't get my head around it. Can somebody give me a hint? Thanks!
Depends what similarity metric you want to use.
Cosine similarity is universally useful & built-in:
sim = gensim.matutils.cossim(vec_lda1, vec_lda2)
Hellinger distance is useful for similarity between probability distributions (such as LDA topics):
import numpy as np
dense1 = gensim.matutils.sparse2full(lda_vec1, lda.num_topics)
dense2 = gensim.matutils.sparse2full(lda_vec2, lda.num_topics)
sim = np.sqrt(0.5 * ((np.sqrt(dense1) - np.sqrt(dense2))**2).sum())