Clustering using the Latent Dirichlet Allocation algorithm in gensim

python algorithm cluster-analysis latent-semantic-indexing

Sharmila · Jun 26, 2011 · Viewed 13.8k times · Source

Is it possible to do clustering in gensim for a given set of inputs using LDA? How can I go about it?

Answer

LDA produces a lower-dimensional representation of the documents in a corpus: each document becomes a vector of topic weights. You can apply a standard clustering algorithm, e.g. k-means, to this low-dimensional representation. Since each axis corresponds to a topic, a simpler approach is to assign each document to the topic on which its weight (its projection onto that axis) is largest.
