JamesR

Reputation: 63

pheatmap in R. How to split data into groups and then cluster

See the original example here for a heatmap that clusters rows within split groups.

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("ComplexHeatmap")

library("ComplexHeatmap")
library("cluster")

df <- scale(mtcars)

Heatmap(df, name ="mtcars", 
        split = data.frame(cyl = mtcars$cyl, am = mtcars$am),
        row_names_gp = gpar(fontsize = 7))

This produces a heatmap of mtcars split into groups by am = Transmission (0 = automatic, 1 = manual).

Heatmap of mtcars split into groups by am = Transmission (0 = automatic, 1 = manual)

So, the question is: is there any way to do this clustering within split groups using pheatmap, or is it best to start over with the ComplexHeatmap package?

Upvotes: 0

Views: 1299

Answers (1)

JamesR

Reputation: 63

After a bit more thinking I found this solution. However, I am unable to plot the dendrogram in its proper position on the y axis. If this were to be presented, I would spend some time adjusting the text size and put the two images side by side.

library(pheatmap)
library(stats)

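# Split the data by transmission type (am: 0 = automatic, 1 = manual).
mtcars_split <- split(mtcars, mtcars$am)

# Cluster within each group and return its rows in dendrogram order.
clustered_ordered <- lapply(mtcars_split, function(group) {
  dist_mat <- dist(scale(group[, -which(names(group) == "am")]))
  hc <- hclust(dist_mat)
  group[hc$order, ]
})

# Stack the ordered groups back into one data frame.
combined <- do.call(rbind, clustered_ordered)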

combined_am <- combined$am
names(combined_am) <- rownames(combined)

group.df.combined <- data.frame(Transmission = factor(combined_am, labels = c("Automatic", "Manual")))
rownames(group.df.combined) <- rownames(combined)

annotation_colors <- list(Transmission = c("Automatic" = "pink", "Manual" = "orange"))

pheatmap(combined[, -which(names(combined) == "am") ],
         cluster_rows = FALSE,
         cluster_cols = FALSE,
         annotation_row = group.df.combined,
         annotation_colors = annotation_colors,
         show_rownames = TRUE,
         show_colnames = TRUE)

pheatmap plot with group clustering
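If a visible break between the groups is wanted, similar to the split in ComplexHeatmap, pheatmap's gaps_row argument may help; it only takes effect when rows are not clustered by pheatmap itself, which is the case here. The following is a minimal sketch under that assumption. The names cars, group_sizes and gap_rows are introduced only for this illustration, and the pheatmap call is skipped if the package is not installed.

```r
# Work on a pristine copy of the dataset (hypothetical name: cars).
cars <- datasets::mtcars

# Rebuild the within-group ordering from the answer using base R only.
cars_split <- split(cars, cars$am)
clustered_ordered <- lapply(cars_split, function(group) {
  vars <- group[, names(group) != "am"]
  group[hclust(dist(scale(vars)))$order, ]
})
combined <- do.call(rbind, clustered_ordered)

# Row index at which each group ends; the last group needs no gap after it.
group_sizes <- vapply(clustered_ordered, nrow, integer(1))
gap_rows <- unname(head(cumsum(group_sizes), -1))

# Draw the heatmap with a gap after each group boundary.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(combined[, names(combined) != "am"],
                     cluster_rows = FALSE,
                     cluster_cols = FALSE,
                     gaps_row = gap_rows)
}
```

With more than two groups, gap_rows simply holds one index per boundary, so the same lines should carry over to other grouping columns.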

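# Note: this clusters all rows together, so its leaf order differs from the
# within-group row order used in the heatmap above.
dist.mat.combined <- dist(scale(combined[, -which(names(combined) == "am")]))
hc_combined <- hclust(dist.mat.combined)

plot(hc_combined)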

dendrogram of grouped and clustered y axis
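One way to get dendrograms whose leaf order matches the heatmap rows is to keep one hclust object per group and plot each in its own panel. This is only a sketch using base R graphics; it does not attach the trees to the pheatmap y axis. The names cars, group_trees and leaf_order are assumptions made for this example.

```r
# Work on a pristine copy of the dataset (hypothetical name: cars).
cars <- datasets::mtcars
cars_split <- split(cars, cars$am)

# One hclust object per group, built the same way as in the answer.
group_trees <- lapply(cars_split, function(group) {
  hclust(dist(scale(group[, names(group) != "am"])))
})

# Leaf labels in plotting order; within each group this is the same
# order that group[hc$order, ] produces for the heatmap rows.
leaf_order <- lapply(group_trees, function(hc) hc$labels[hc$order])

# Draw one dendrogram panel per group, side by side.
old_par <- par(mfrow = c(1, length(group_trees)))
for (g in names(group_trees)) {
  plot(group_trees[[g]], main = paste("am =", g),
       xlab = "", sub = "", cex = 0.7)
}
par(old_par)
```

Each panel can then be placed next to the matching block of heatmap rows when assembling a figure.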

*Edit 1. To add a second grouping, see below. I think this is nifty, and as far as I can tell it isn't demonstrated anywhere else.

library(pheatmap)
library(stats)

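# Build one grouping key per cylinder/transmission combination, e.g. "4.0".
mtcars$cyl.am <- paste(mtcars$cyl, mtcars$am, sep = ".")
mtcars_split <- split(mtcars, mtcars$cyl.am)

# Columns that are constant within a group (cyl and am, at least) become NaN
# after scale(); dist() excludes those entries when computing distances.
clustered_ordered <- lapply(mtcars_split, function(group) {
  dist_mat <- dist(scale(group[, -which(names(group) == "cyl.am")]))
  hc <- hclust(dist_mat)
  group[hc$order, ]
})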

combined <- do.call(rbind, clustered_ordered)

combined_am <- combined$am
names(combined_am) <- rownames(combined)
combined_cyl <- combined$cyl
names(combined_cyl) <- rownames(combined)

group.df.combined <- data.frame(Transmission = factor(combined_am, labels = c("Automatic", "Manual")),
                                cylinders = factor(combined_cyl, labels = c("4", "6", "8")))

rownames(group.df.combined) <- rownames(combined)

annotation_colors <- list(Transmission = c("Automatic" = "pink", "Manual" = "orange"),
                          cylinders = c("4" = "black", "6" = "grey", "8" = "white"))

pheatmap(combined[, -which(names(combined) == "cyl.am") ],
         cluster_rows = FALSE,
         cluster_cols = FALSE,
         annotation_row = group.df.combined,
         annotation_colors = annotation_colors,
         show_rownames = TRUE,
         show_colnames = TRUE)

Two groups clustered
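A caveat with the two-factor version: inside each cyl.am group, cyl and am (and sometimes other columns) are constant, so scaling within the group gives NaN for them. One possible alternative, sketched here and not what the code above does, is to scale once over all cars and then cluster within each group on the shared scale. This changes the distances, so treat it as a design choice. The names cars, num_cols, scaled, grp and row_order are assumptions for this illustration.

```r
# Work on a pristine copy of the dataset (hypothetical name: cars).
cars <- datasets::mtcars

# Scale once over all cars, leaving out the two grouping variables,
# so no column turns into NaN inside a small group.
num_cols <- setdiff(names(cars), c("cyl", "am"))
scaled <- scale(cars[, num_cols])

# Same grouping key as in the answer: "4.0", "4.1", "6.0", ...
grp <- paste(cars$cyl, cars$am, sep = ".")

# Cluster the row indices of each group on the globally scaled values.
row_order <- unlist(lapply(split(seq_len(nrow(cars)), grp), function(idx) {
  if (length(idx) < 2) return(idx)  # hclust needs at least two rows
  idx[hclust(dist(scaled[idx, , drop = FALSE]))$order]
}), use.names = FALSE)

# Rows grouped by cyl.am and ordered by clustering within each group.
combined_scaled <- scaled[row_order, ]
```

The resulting matrix can be passed to pheatmap with cluster_rows = FALSE in the same way as combined, with the annotation data frame built from cars[row_order, c("am", "cyl")].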

*Edit 2. I just found this post, which goes into detail about adding additional factors to annotation_row (similar to what I did above). However, it doesn't cover grouping before clustering.

Upvotes: 0
