Clustering using the Latent Dirichlet Allocation algorithm in gensim

Sharmila · Jun 26, 2011 · Viewed 13.8k times

Is it possible to do clustering in gensim for a given set of inputs using LDA? How can I go about it?

Answer

cdf · Jun 29, 2011

LDA produces a lower-dimensional representation of the documents in a corpus: each document becomes a vector of topic weights. You could apply a clustering algorithm such as k-means to this low-dimensional representation. Since each axis corresponds to a topic, a simpler approach is to assign each document to the topic on which it has the largest weight.