I am trying to plot the results of a hierarchical clustering in R as a dendrogram, with rectangles identifying the clusters.
The following code does the trick for a vertical dendrogram, but for a horizontal dendrogram (horiz = TRUE) the rectangles are not drawn. Is there a way to do the same for horizontal dendrograms?
library("cluster")
dst <- daisy(iris, metric = c("gower"), stand = FALSE)
hca <- hclust(dst, method = "average")
plot(as.dendrogram(hca), horiz = FALSE)
rect.hclust(hca, k = 3, border = "red")
Moreover, I would like to draw a line that cuts the tree at a desired distance value. How can I plot that in R? The cutree function returns the cluster assignments, but is it possible to plot the cut as well?
cutree(hca, k = 3)
The desired output that I am looking for is like this.
How to get this done in R?
Both jlhoward's and Backlin's answers are good.
What you could also try is the dendextend package, which is designed exactly for this sort of thing. It has a rect.dendrogram function that works like rect.hclust, but with a horiz parameter (plus some more control over the location of the edge of the rectangle). To find the relevant height you can use the heights_per_k.dendrogram function (which is much faster when the dendextendRcpp package is also loaded).
Here is a simple example of how you would get the same result as in the above examples (with the added bonus of colored branches, just for fun):
install.packages("dendextend")
install.packages("dendextendRcpp")
library("dendextend")
library("dendextendRcpp")
# using piping to get the dend
dend <- iris[,-5] %>% dist %>% hclust %>% as.dendrogram
# plot + color the dend's branches before, based on 3 clusters:
dend %>% color_branches(k=3) %>% plot(horiz=TRUE, main = "The dendextend package \n Gives extended functionality to R's dendrogram object")
# add horiz rect
dend %>% rect.dendrogram(k=3,horiz=TRUE)
# add horiz (well, vertical) line:
abline(v = heights_per_k.dendrogram(dend)["3"] + .6, lwd = 2, lty = 2, col = "blue")
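If you would rather not depend on an extra package for the cut line, the height can also be worked out in base R. The following is only a minimal sketch, not what dendextend does internally: it assumes a linkage with monotone merge heights (average linkage here) and no tie between the two heights around the cut. The names hc, h_sorted, lower, upper and cut_h are just illustrative.

```r
# Base R only; uses the built-in iris data set.
hc <- hclust(dist(iris[, -5]), method = "average")

k <- 3
n <- length(hc$order)          # number of observations
h_sorted <- sort(hc$height)    # the n - 1 merge heights, ascending

# After n - k merges there are k clusters left, so any height strictly
# between merge (n - k) and merge (n - k + 1) cuts the tree into k groups.
lower <- h_sorted[n - k]
upper <- h_sorted[n - k + 1]
cut_h <- (lower + upper) / 2   # assumption: lower < upper (no tie)

# Cluster assignments obtained by cutting at that height
groups <- cutree(hc, h = cut_h)

# Horizontal dendrogram: heights run along the x axis, so the cut is a
# vertical line (use abline(h = cut_h) for a vertical dendrogram instead).
plot(as.dendrogram(hc), horiz = TRUE)
abline(v = cut_h, lwd = 2, lty = 2, col = "blue")
```

Taking the midpoint is only one choice; any value between lower and upper gives the same clusters, so you can move the line to wherever it looks best on the plot.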