Ali

Reputation: 1080

Comparing hierarchical clusterings in R

I'm using the function cor_cophenetic from the package dendextend to calculate the cophenetic correlation between 6 hierarchical clusterings. Each call returns the correlation for one pair.

Currently the code I'm using is simply:

cor_cophenetic(hcr1,hcr2)
cor_cophenetic(hcr1,hcr3)
cor_cophenetic(hcr1,hcr4)
cor_cophenetic(hcr1,hcr5)
cor_cophenetic(hcr1,hcr6)
cor_cophenetic(hcr2,hcr3)
            :
            :
cor_cophenetic(hcr4,hcr6)
cor_cophenetic(hcr5,hcr6)

which outputs the correlations individually.

I know the function outer can build this kind of pairwise table, but I'm not sure how to use cor_cophenetic with it. I'd like the output as a single matrix (6x6 for the six clusterings, covering all 15 pairs).

Also, this only calculates the correlations. Is there a way to compare two dendrograms visually?

Upvotes: 1

Views: 464

Answers (1)

AkselA

Reputation: 8836

After further reading, I found that while cor_cophenetic() can't process more than two dendlist elements at a time, cor.dendlist() can, and it computes the cophenetic correlation (among other things), which makes things a whole lot simpler. Here dend.l and met are the objects built in the original answer further down.

names(dend.l) <- met
round(cor.dendlist(dend.l), 4)
#          complete single average centroid
# complete   1.0000 0.4925  0.6044   0.4822
# single     0.4925 1.0000  0.9851   0.9959
# average    0.6044 0.9851  1.0000   0.9871
# centroid   0.4822 0.9959  0.9871   1.0000

Original answer using cor_cophenetic() and example data:

I don't think outer() works on this directly, as it expects atomic input (vector, matrix or array) and a vectorized function, not a list of dendrograms. We'll have to roll our own using expand.grid() and apply().

library(dendextend)
library(magrittr)

# example data
set.seed(23235)
ss <- sample(1:150, 10 )

dend.l <- dendlist()
met <- c("complete", "single", "average", "centroid")

# one dendrogram per linkage method
for (i in seq_along(met)) {
    dend <- iris[ss,-5] %>% dist %>% hclust(met[i])
    dend.l[[i]] <- as.dendrogram(dend)
}

ind <- expand.grid(1:length(dend.l), 1:length(dend.l))

# turns out cor_cophenetic has a method for dendlist where you can
# specify which elements you want to compare. Simplifies things a little
v <- apply(ind, 1, function(x) cor_cophenetic(dend.l, x))
m <- matrix(v, length(dend.l))
dimnames(m) <- list(met, met)

round(m, 4)
#          complete single average centroid
# complete   1.0000 0.4925  0.6044   0.4822
# single     0.4925 1.0000  0.9851   0.9959
# average    0.6044 0.9851  1.0000   0.9871
# centroid   0.4822 0.9959  0.9871   1.0000

As you can see, the matrix is symmetric, so we could get away with combn() instead of expand.grid(), which would give us just one of the triangles.
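A minimal sketch of that combn() variant, under the assumption that the dendrograms are kept in a plain named list (called dends here, a name not used above) rather than a dendlist. Each unordered pair is computed once and mirrored into both triangles; the diagonal is set to 1 because a dendrogram correlates perfectly with itself.

```r
library(dendextend)

# same example data as above
set.seed(23235)
ss <- sample(1:150, 10)
met <- c("complete", "single", "average", "centroid")

# 'dends' is an illustrative name: a plain list with one dendrogram per method
dends <- lapply(met, function(m) {
    as.dendrogram(hclust(dist(iris[ss, -5]), method = m))
})
names(dends) <- met

# every unordered pair of indices, one pair per column
pairs <- combn(length(dends), 2)

# start from the identity matrix, then fill both triangles from each pair
m <- diag(length(dends))
dimnames(m) <- list(met, met)
for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    m[i, j] <- m[j, i] <- cor_cophenetic(dends[[i]], dends[[j]])
}

round(m, 4)
```

With six clusterings this makes 15 calls instead of 36, and the same loop works unchanged for a list such as list(hcr1, ..., hcr6) converted with as.dendrogram().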

As for comparing two dendrograms visually, take a look at the section "Comparing two dendrograms" in Introduction to dendextend.
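As a rough sketch of what a visual comparison can look like, dendextend's tanglegram() draws two dendrograms facing each other with lines joining matching leaves, and entanglement() summarises how tangled those lines are. The names dend1 and dend2 and the choice of the two linkage methods are only for illustration.

```r
library(dendextend)

# same example data as above
set.seed(23235)
ss <- sample(1:150, 10)
d <- dist(iris[ss, -5])

# two clusterings of the same observations (illustrative choice of methods)
dend1 <- as.dendrogram(hclust(d, method = "complete"))
dend2 <- as.dendrogram(hclust(d, method = "average"))

# side-by-side trees with connecting lines between identical labels
tanglegram(dend1, dend2)

# numeric summary between 0 and 1; lower means the two layouts line up better
e <- entanglement(dend1, dend2)
e
```

Both dendrograms need the same set of leaf labels for the connecting lines to be meaningful, which holds here because they are built from the same distance matrix.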


Upvotes: 3
