Tzac

Reputation: 101

How to cut a dendrogram in R

Okay so I'm sure this has been asked before but I can't find a nice answer anywhere after many hours of searching.

I have some data, I run a classification, and then I make a dendrogram.

The problem has to do with aesthetics, specifically: (1) how to cut according to the number of groups (in this example I want 3), (2) how to align the group labels with the branches of the tree, and (3) how to re-scale so that there aren't any huge gaps between the groups.

More on (3). I have a dataset which is very species rich, and there would be ~1000 groups without cutting. If I cut at, say, 3, the tree has some branches on the right and one 'miles' off to the right, which I would want to re-scale so that it's closer. All of this is possible via external programs, but I want to do it all in R!

Bonus points if you can put an average silhouette width plot nested into the top right of this plot.

Here is an example using the iris data:

library(vegan)  # provides vegdist()
data(iris)
df = data.frame(iris)
df$Species = NULL  # keep only the four numeric columns

ED10 = vegdist(df,method="euclidean")
EucWard_10 = hclust(ED10,method="ward.D2")
hcd_ward10 = as.dendrogram(EucWard_10)

plot(hcd_ward10)
# Title now matches the cut height actually used (h = 10)
plot(cut(hcd_ward10, h = 10)$upper, main = "Upper tree of cut at h=10")

Upvotes: 1

Views: 2479

Answers (1)

Tal Galili

Reputation: 25306

I suspect what you want to look at is the dendextend R package (there is also a paper about it in Bioinformatics).

I am not fully sure about your question on (3), since I am not sure I understand what rescaling means. What I can tell you is that you can do quite a lot with dendextend. Here is a quick example of coloring the branches and labels for 3 groups.

library(vegan)  # provides vegdist()
data(iris)
df = data.frame(iris)
df$Species = NULL

ED10 = vegdist(df,method="euclidean")
EucWard_10 = hclust(ED10,method="ward.D2")
hcd_ward10 = as.dendrogram(EucWard_10)

plot(hcd_ward10)

# install.packages("dendextend")  # run once if the package is not installed
library(dendextend)
dend <- hcd_ward10
dend <- color_branches(dend, k = 3)  # color the branches of the 3-group cut
dend <- color_labels(dend, k = 3)    # color the leaf labels to match
plot(dend)
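If what you are after in (1) and (3) is the group membership itself plus a tree with one leaf per group, a minimal base-R sketch could look like the following. This is only one way to read "rescaling": it keeps just the part of the tree above the cut. It assumes `dist()` as a stand-in for `vegdist(method = "euclidean")`, and the names `hs`, `h_cut` and `parts` are illustrative.

```r
# Base R only; dist() stands in for vegan::vegdist(method = "euclidean")
df <- iris[, 1:4]
hc <- hclust(dist(df), method = "ward.D2")

k <- 3
groups <- cutree(hc, k = k)   # integer group id (1..k) for each row
print(table(groups))

# Any height between the (k-1)-th and k-th largest merge heights gives k groups
hs <- sort(hc$height, decreasing = TRUE)
h_cut <- mean(hs[(k - 1):k])

# Keep only the part of the tree above the cut: one leaf per group
parts <- cut(as.dendrogram(hc), h = h_cut)
plot(parts$upper, main = paste("Upper tree, cut into", k, "groups"))
```

The sub-trees below the cut are kept in `parts$lower`, so each group can still be plotted on its own if needed.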


You can also get an interactive dendrogram by using plotly (a ggplot method for dendrograms is available through dendextend):

library(plotly)
library(ggplot2)
p <- ggplot(dend)
ggplotly(p)
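For the bonus point about the average silhouette width, here is a rough sketch using base graphics and the cluster package (a recommended package that normally ships with R; that it is installed is an assumption). It computes the average silhouette width for k = 2 to 10 and draws it as an inset in the top right via `par(fig = ..., new = TRUE)`. The inset coordinates and the names `ks` and `avg_sil` are arbitrary choices, not anything prescribed by dendextend.

```r
library(cluster)  # provides silhouette()

df <- iris[, 1:4]
d  <- dist(df)                       # stand-in for vegdist(method = "euclidean")
hc <- hclust(d, method = "ward.D2")

# Average silhouette width for each candidate number of groups
ks <- 2:10
avg_sil <- sapply(ks, function(k) {
  sil <- silhouette(cutree(hc, k = k), d)
  mean(sil[, "sil_width"])
})

# Main plot: the dendrogram without leaf labels
plot(as.dendrogram(hc), leaflab = "none")

# Inset in the top-right corner (fig = x1, x2, y1, y2 in device fractions)
op <- par(fig = c(0.6, 0.98, 0.5, 0.98), new = TRUE, mar = c(3, 3, 1, 1))
plot(ks, avg_sil, type = "b", xlab = "", ylab = "", cex.axis = 0.7)
par(op)  # restore the previous graphics settings
```

The inset will overlap the tree if tall branches reach into that corner, so the `fig` values may need adjusting for other data.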


Upvotes: 1
