Fabrizio

Reputation: 939

pheatmap clustering order

I have this dataset:

> dput(mdata2)
structure(list(EE = c(3.3221428469822, 3.62699732299098, 1.75430154205983, 
0.809228977410138, 1.24117055233438, 2.93403148663873, 4.01630566539058, 
1.5334176884274, 0.598331636908125, 0.793972781447563), MI = c(3.20812756072619, 
3.73729312689288, 2.32268411219261, 1.16578395871478, 2.02899881030574, 
2.43723772029964, 3.98855299963887, 1.76088057576795, 0.684310806612439, 
1.512739619069), PI = c(2.97858874003521, 3.77000551572515, 2.82873533944253, 
2.17460752133963, 2.81526651451227, 2.31001492452861, 4.0325069645006, 
2.35155407135517, 1.41519706213454, 2.62939873195416)), row.names = c("A", 
"B", "C", "D", "E", "F", "G", "H", "I", "J"), class = "data.frame")

The columns are EE, MI, and PI. I want to draw a heatmap of this data, so I run:

pheatmap(mdata2, cluster_cols = TRUE, clustering_method = "ward.D")

This creates the following image:

[image: heatmap]

My problem is that the position of one of the columns does not make sense to me. If I compute the squared Euclidean distance between EE and MI, I get

sum((mdata2[, "EE"] - mdata2[,"MI"])^2)
1.91936

Then between EE and PI

sum((mdata2[, "EE"] - mdata2[,"PI"])^2)
10.72999

and finally between MI and PI

sum((mdata2[, "MI"] - mdata2[,"PI"])^2)
4.093923
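As a cross-check, the three values above can be computed in one call. This is only a sketch: `pairwise_sq_dist` is a hypothetical helper name, and the toy data frame stands in for `mdata2`. It relies on base R's `dist()`, which returns Euclidean distances between the rows of its input, so the columns are transposed into rows first and the result is squared.

```r
# Hypothetical helper: squared Euclidean distance between every pair of columns.
pairwise_sq_dist <- function(df) {
  # dist() compares rows, so transpose to compare columns instead
  as.matrix(dist(t(df)))^2
}

# Toy data for illustration (not the data from the question)
toy <- data.frame(a = c(0, 0), b = c(3, 4), c = c(6, 8))
sq <- pairwise_sq_dist(toy)
print(sq)
```

Calling `pairwise_sq_dist(mdata2)` should give the same three numbers as the `sum(...)` lines above, arranged as a symmetric matrix.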

So it is correct that the clustering groups EE and MI together, since their distance is the smallest. But why is PI placed on the left? Shouldn't it be next to MI, since the distance between PI and MI is lower than the distance between PI and EE? I would like to keep the column clustering but also have MI and PI adjacent, because their distance is small. Keep in mind that I need to apply this to other data, so I need a general answer, not one that is ad hoc for this example. Thanks.

Upvotes: 0

Views: 51

Answers (1)

Fabrizio

Reputation: 939

It turns out that a similar question was already asked here: pheatmap: manually re-order leaves in dendogram

For me this worked:

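# Callback that reorders the dendrogram leaves without changing the clustering
callback = function(hc, mat){
  # weight each leaf by its entry in the second right singular vector of t(mat)
  sv = svd(t(mat))$v[,c(2)]
  # rotate the branches so leaves are sorted by these weights
  dend = reorder(as.dendrogram(hc), wts = sv)
  as.hclust(dend)
}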

pheatmap(mdata2, clustering_callback = callback)
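Why this keeps the clustering intact: `reorder()` only rotates the branches at each merge, so the tree and its merge heights stay the same and only the left-to-right leaf order changes. According to `?hclust`, the default order puts the tighter cluster on the left, with single observations counting as the tightest, which would explain why a lone column like PI ends up on the left. The sketch below illustrates the rotation with base R only; the toy matrix and the weights `w` are assumptions for illustration, not the SVD-based weights used above.

```r
# Toy 1-D data: a and b are close, c is far away (illustrative only)
m <- matrix(c(0, 1, 3), ncol = 1, dimnames = list(c("a", "b", "c"), NULL))
hc <- hclust(dist(m), method = "ward.D")

# Hypothetical leaf weights, one per observation in the original row order
w <- c(1, 2, 3)

# At each merge, branches are sorted by increasing mean weight
dend <- reorder(as.dendrogram(hc), wts = w, agglo.FUN = mean)
hc2 <- as.hclust(dend)

print(hc$labels[hc$order])    # leaf order before reordering
print(hc2$labels[hc2$order])  # leaf order after reordering

# The tree itself is unchanged: cophenetic distances are identical
print(all.equal(as.matrix(cophenetic(hc)), as.matrix(cophenetic(hc2))))
```

Any weight vector works with `reorder()`, so a different choice of weights (or of `agglo.FUN`) gives a different leaf order for the same tree.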

Upvotes: 0
