Reputation: 285
I recently started using R to cluster my data. My goal is a heatmap with its dendrogram, with each cluster marked on the heatmap by a square.
So far I have tried hclust (which comes with base R's stats package, not gplots), and I could draw rectangles on the dendrogram with the code below:
a <- read.table("test.txt", header = TRUE)
b <- as.dist(a)
dend <- hclust(b, method = "complete")
plot(dend)
groups <- cutree(dend, k = 3)
rect.hclust(dend, k = 3, border = "green")
My test.txt file looks like this:
a b c d e f
a 1 0.1 0.9 0.5 0.65 0.9
b 0.1 1 0.39 0.83 0.47 0.63
c 0.9 0.39 1 0.42 0.56 0.84
d 0.5 0.83 0.42 1 0.95 0.43
e 0.65 0.47 0.56 0.95 1 0.14
f 0.9 0.63 0.84 0.43 0.14 1
This code works: it gives me a dendrogram and the related clusters.
What I really want is something similar, but on the heatmap. I would like the heatmap with squares around the clusters, plus the list of members of each cluster. The heatmap should look like this:
As I work with a large data set (a 5200 x 700 matrix), I need a way to save the list of members of each cluster.
I also tried pheatmap from the pheatmap package, but I'm not sure how it does the clustering, and I cannot get rectangles around the clusters.
Any suggestions and comments are very welcome.
Upvotes: 0
Views: 305
Reputation: 7674
Are you looking for something like this? I created the rectangle by hand; you are probably looking for R to draw it for you, once for each of the three clusters above.
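library(reshape2)  # for melt()
library(ggplot2)
# "clipboard" only works on Windows; elsewhere read test.txt directly
data <- read.table(file = "clipboard") # copied your test.txt file from above
data.m <- melt(data)  # long format with columns "variable" and "value"
ggplot(data.m, aes(x = variable, y = value, fill = value)) + geom_tile() +
  scale_fill_gradient(low = "red", high = "green") +
  # rectangle limits are chosen by hand, not derived from the clustering
  annotate("rect", xmin = min(as.integer(data.m$variable)), xmax = max(as.integer(data.m$variable)),
           ymin = .01, ymax = .2, fill = "transparent", col = "black", lwd = 2)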
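If you want R to place the squares itself, one possible approach is to reorder the matrix in dendrogram order and take the square corners from the runs of cluster ids returned by cutree. The following is only a rough sketch in base R, not a tested solution for your 5200 x 700 matrix. It types in the six-item example instead of reading test.txt, treats the values as distances exactly as your as.dist() call does (an assumption about what they mean), and writes the member lists to a made-up file name, clusters.txt, in a temporary directory.

```r
# Sketch only: base R, no extra packages.
# Toy matrix copied from the question's test.txt.
labs <- c("a", "b", "c", "d", "e", "f")
m <- matrix(c(1,    0.1,  0.9,  0.5,  0.65, 0.9,
              0.1,  1,    0.39, 0.83, 0.47, 0.63,
              0.9,  0.39, 1,    0.42, 0.56, 0.84,
              0.5,  0.83, 0.42, 1,    0.95, 0.43,
              0.65, 0.47, 0.56, 0.95, 1,    0.14,
              0.9,  0.63, 0.84, 0.43, 0.14, 1),
            nrow = 6, byrow = TRUE, dimnames = list(labs, labs))

# Same clustering as in the question (values treated as distances).
hc <- hclust(as.dist(m), method = "complete")
groups <- cutree(hc, k = 3)   # named vector: item name -> cluster id

# Members of each cluster, saved as one line per cluster.
members <- split(names(groups), groups)
out_file <- file.path(tempdir(), "clusters.txt")   # hypothetical output path
member_lines <- paste0("cluster ", names(members), ": ",
                       vapply(members, paste, character(1), collapse = " "))
writeLines(member_lines, out_file)

# Reorder rows and columns in dendrogram order, so that each cluster
# occupies a contiguous block along both axes.
ord <- hc$order
m_ord <- m[ord, ord]
n <- nrow(m_ord)

# With x = 1:n and y = 1:n, image() centres cell (i, j) on the point (i, j).
image(1:n, 1:n, m_ord, axes = FALSE, xlab = "", ylab = "",
      col = heat.colors(20))
axis(1, at = 1:n, labels = rownames(m_ord))
axis(2, at = 1:n, labels = colnames(m_ord))

# Runs of equal cluster ids along the ordered axis give the square corners.
runs <- rle(unname(groups[ord]))
end <- cumsum(runs$lengths)
start <- end - runs$lengths + 1
rect(start - 0.5, start - 0.5, end + 0.5, end + 0.5,
     border = "black", lwd = 2)
```

On the example data, k = 3 should split the items into a/b, c/d and e/f, so the plot gets three 2 x 2 squares along the diagonal. For a non-square matrix you would cluster rows and columns separately and draw the rectangles from two sets of runs instead of one.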
Upvotes: 1