Plot dendrogram using sklearn.AgglomerativeClustering

Shukhrat Khannanov · Mar 18, 2015 · Viewed 26.6k times

I'm trying to build a dendrogram using the children_ attribute provided by AgglomerativeClustering, but so far I'm out of luck. I can't use scipy.cluster, since the agglomerative clustering provided in scipy lacks some options that are important to me (such as the option to specify the number of clusters). I would be really grateful for any advice.

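    from sklearn import cluster
    clusterer = cluster.AgglomerativeClustering(n_clusters=2)
    clusterer.fit(X)  # X: the (n_samples, n_features) data array; children_ exists only after fitting
    clusterer.children_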

Answer

David Diaz · Sep 20, 2017

Here is a simple function for taking a hierarchical clustering model from sklearn and plotting it using the scipy dendrogram function. It seems that plotting functions are often not directly supported in sklearn. There is an interesting discussion of this in the pull request for the plot_dendrogram code snippet.
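As a rough illustration of the idea (a sketch, not necessarily the snippet from that pull request), the function below converts a fitted model into the linkage-matrix layout that scipy's dendrogram expects. The name linkage_from_model and the random data are made up for this example, and it assumes scikit-learn 0.24 or later, where compute_distances=True fills in distances_ even when n_clusters is set.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram
from sklearn.cluster import AgglomerativeClustering


def linkage_from_model(model):
    """Build a scipy-style linkage matrix from a fitted AgglomerativeClustering.

    Hypothetical helper: each row is [child_a, child_b, distance, sample_count].
    """
    n_samples = len(model.labels_)
    counts = np.zeros(model.children_.shape[0])
    for i, merge in enumerate(model.children_):
        # Indices below n_samples are original points; larger ones refer to
        # the cluster created by merge number (index - n_samples).
        counts[i] = sum(
            1 if child < n_samples else counts[child - n_samples]
            for child in merge
        )
    return np.column_stack(
        [model.children_, model.distances_, counts]
    ).astype(float)


# Made-up data, only to exercise the helper.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))

# compute_distances=True is needed so that distances_ is available
# while still asking for a fixed number of clusters.
model = AgglomerativeClustering(n_clusters=2, compute_distances=True).fit(X)
Z = linkage_from_model(model)

# no_plot=True only computes the tree layout; remove it (with matplotlib
# installed) to actually draw the dendrogram.
tree = dendrogram(Z, no_plot=True)
```

Keyword arguments such as truncate_mode or labels can be passed to dendrogram as usual once the matrix is built.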

I'd clarify that the use case you describe (defining the number of clusters) is available in scipy: after you've performed the hierarchical clustering using scipy's linkage, you can cut the hierarchy into whatever number of clusters you want using fcluster, with the number of clusters passed as the t argument and criterion='maxclust'.
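A minimal sketch of that scipy-only route, assuming Ward linkage and some made-up two-blob data (neither is specified in the question):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Two well-separated blobs of made-up points.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(10, 2)),
    rng.normal(5.0, 0.5, size=(10, 2)),
])

# Build the full hierarchy once.
Z = linkage(X, method="ward")

# Cut it so that at most t=2 flat clusters remain; labels start at 1.
labels = fcluster(Z, t=2, criterion="maxclust")
```

The same Z can be passed straight to scipy's dendrogram, so the plot and the flat cluster labels come from one hierarchy.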