Reputation: 360
I ran hierarchical clustering on a dataset (with agnes()) and it worked well (I followed the approach described here: https://uc-r.github.io/hc_clustering). Now I want to compare manmade clusters or classes in the dataset with the ones that the hierarchical clustering found. I think I can do this with tanglegram().
I do not know how to generate a dendrogram or run hierarchical clustering when I already have groups. How can I tell R about the groups?
It would be great if you could answer this question step by step.
```
set.seed(73)
great <- data.frame(c0 = c("r1", "r2", "r3", "r4", "r5", "r6"),
                    c1 = c(0.89, 46, 0, 0.56, 12, 0),
                    c2 = c(0, 0.45, 45, 79, 0.45, 4.4))
# numeric columns only; the ids in c0 become row names
great_num <- great[, c("c1", "c2")]
rownames(great_num) <- great$c0
# euclidean distance
great_dist <- dist(great_num)
# agglomerative clustering with agnes()
# Ward's method minimizes the total within-cluster variance:
# at each step the two clusters with the smallest between-cluster distance are merged
hc1_wards <- agnes(great_num, method = "ward")
# agglomerative coefficient
hc1_wards$ac
pltree(hc1_wards, cex = 0.6, hang = -1,
       main = "Dendrogram\nagglomerative clustering", labels = FALSE)
# choosing the number of clusters with the average silhouette method
# (k.max must stay below the number of observations)
fviz_nbclust(great_num, FUNcluster = hcut, method = "silhouette", k.max = 5)
# cut the tree into 2 groups
great_grp <- agnes(great_num, method = "ward")
great_grp_cut <- cutree(as.hclust(great_grp), k = 2)
# use the cutree() output to add the cluster each observation belongs to
great_cluster <- mutate(great, cluster = great_grp_cut)
# number of observations per cluster
count(great_cluster, cluster)
# evaluate the goodness of the clustering with dunn()
dunn <- clValid::dunn(distance = great_dist, clusters = great_grp_cut)
```
Rows 1, 2, 4 and rows 3, 5, 6 are the manmade clusters of great:
cl1 <- great[c(1,2,4), ]
cl2 <- great[c(3,5,6), ]
I want to compare the hierarchical clustering with the manmade clustering. How can I build a dendrogram from the manmade clustering so that I can compare the two with tanglegram()? Is there another way to compare them?
Upvotes: 1
Views: 296
Reputation: 9656
To compare the clusters visually you can use the plotDendroAndColors() function from the WGCNA package. The function simply displays custom color information for each object under the dendrogram.
I cannot reproduce your example (the packages you used in your code are not specified), so I am demonstrating this using a simple clustering of the iris dataset:
library(WGCNA)
fit <- hclust(dist(iris[,-5]), method="ward.D")  # "ward" is the former name of "ward.D"
groups <- cutree(fit, 3)
manmade <- as.numeric(iris$Species)
plotDendroAndColors(fit, cbind(clusters=labels2colors(groups), manmade=labels2colors(manmade)))
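As a numeric complement to the plot, the two labelings can also be cross-tabulated. The following is only a sketch in base R on the same iris clustering; the variable names `tab` and `purity` are introduced here for illustration, and the "purity" summary (share of observations that fall in the best-matching class of their cluster) is one simple choice among several possible agreement measures.

```r
# Same iris clustering as in the plotting example; "ward.D" is the current
# name of the method that hclust() used to call "ward".
fit <- hclust(dist(iris[, -5]), method = "ward.D")
groups <- cutree(fit, 3)      # clusters found by the algorithm
manmade <- iris$Species       # known classes

# Contingency table: rows are found clusters, columns are manmade classes.
tab <- table(clusters = groups, manmade = manmade)
print(tab)

# Illustrative summary: for each cluster take its largest class count,
# then divide the total by the number of observations.
purity <- sum(apply(tab, 1, max)) / sum(tab)
print(purity)
```

A table with one dominant cell per row means the found clusters line up well with the manmade classes; rows spread over several columns point to disagreement.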
Since you are using third-party packages for clustering, you might first have to convert their objects to the hclust class for this plotting function to work, and recompute groups and manmade for your own data. Maybe via:
fit <- as.hclust(hc1_wards)
plotDendroAndColors(fit, cbind(clusters=labels2colors(groups), manmade=labels2colors(manmade)))
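For the toy data in the question, the steps might look like the sketch below. It makes several assumptions: c1 and c2 are meant to be numeric, agnes() comes from the cluster package, the tree is converted with as.hclust() so that it has the class returned by hclust(), and the manmade grouping is encoded as an integer vector built from the row indices given in the question. The plot is only drawn if WGCNA happens to be installed.

```r
library(cluster)  # assumed source of agnes()

# Toy data from the question, with numeric columns (an assumption)
great <- data.frame(c1 = c(0.89, 46, 0, 0.56, 12, 0),
                    c2 = c(0, 0.45, 45, 79, 0.45, 4.4),
                    row.names = paste0("r", 1:6))

hc1_wards <- agnes(great, method = "ward")
fit <- as.hclust(hc1_wards)     # convert the agnes result to an hclust object
groups <- cutree(fit, k = 2)    # clusters found by the algorithm

# Manmade grouping: rows 1, 2, 4 are class 1 and rows 3, 5, 6 are class 2
manmade <- integer(nrow(great))
manmade[c(1, 2, 4)] <- 1L
manmade[c(3, 5, 6)] <- 2L

# Draw both labelings under the dendrogram, if WGCNA is available
if (requireNamespace("WGCNA", quietly = TRUE)) {
  WGCNA::plotDendroAndColors(
    fit,
    cbind(clusters = WGCNA::labels2colors(groups),
          manmade  = WGCNA::labels2colors(manmade))
  )
}
```

The key point is that the manmade grouping is just one label per observation, in the same row order as the data that was clustered; no separate dendrogram is needed for it.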
Upvotes: 2