extract groups from distance matrix using cutree in R

Question

I started with a list of hobbies and people, and I wanted to cluster the people by their common hobbies. So I created a distance matrix, applied hierarchical clustering, and used cutree to cut the tree into a specific number of clusters. Now I have the cutree matrix, but I do not know how to extract the clusters from it. Could you please advise?

Here is an example of what I mean.

The distance matrix:

       one    three   two
one     0      1.0    1.0
three   1      0.0    0.5
two     1      0.5    0.0

Then I used hclust and cutree and got this result:

hc <- hclust(dist, method="ward")
ct <- cutree(hc, k=1:3)

        1       2      3
one     1       1      1
three   1       2      2
two     1       2      3

How do I get a list of people that belong in the same cluster?

Thank you for your help.

AdamO · Accepted Answer

Calling cutree with k=1:3 returns a matrix with one column for each $k \in \{1, 2, 3\}$; each column holds the cluster assigned to every observation for that number of clusters. To bundle the observations by cluster, pick the column for the number of clusters you're interested in (3 in the example below) and group the row names of the matrix by that column's entries.

Example:

hc <- hclust(dist(USArrests))
memb <- cutree(hc, k = 1:5)
tapply(names(memb[, 3]), memb[, 3], c) ## say we're interested in 3 clusters
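
Applied to the 3 x 3 matrix from the question, the same idea might look like the sketch below. It makes a few assumptions that are not in the original post: the matrix is rebuilt by hand under the hypothetical name m, hclust is run with its default linkage rather than the "ward" method used in the question, and split() is used as an alternative to tapply() for the grouping. The names labs, m and groups are illustrative only.

```r
# Rebuild the question's 3 x 3 distance matrix (labels as given there).
labs <- c("one", "three", "two")
m <- matrix(c(0, 1.0, 1.0,
              1, 0.0, 0.5,
              1, 0.5, 0.0),
            nrow = 3, byrow = TRUE, dimnames = list(labs, labs))

# hclust() expects a "dist" object, so convert the square matrix first.
# Assumption: default linkage, not the "ward" method from the question.
hc <- hclust(as.dist(m))

# One column of cluster labels per requested k; columns are named "1", "2", "3".
ct <- cutree(hc, k = 1:3)

# Group the row names by the labels in the k = 2 column.
groups <- split(rownames(ct), ct[, "2"])
print(groups)
```

Here groups is a list with one element per cluster, so groups[["2"]] gives the names of everyone in the second cluster. Changing "2" to another column name gives the grouping for a different number of clusters.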
