Yiyang

Reputation: 39

For hierarchical clustering, how to find the "center" in each cluster in R

I know how to find the center of each cluster in k-means clustering, but I am not sure how to do it for hierarchical clustering in R. Here is my code. First, I built a distance matrix using DTW distance and read it into R:

DTW<-read.csv(file.choose(), head=T, row.names=1)
DTWS2N <- as.dist(as(DTW, "matrix"))

Then I ran hierarchical clustering and cut the tree into k = 10 clusters:

hc <- hclust(DTWS2N)
plot(hc)
groups <- cutree(hc, k=10)
rect.hclust(hc, k=10, border="red")

I can also see which cluster each element belongs to with:

d = data.frame(Cluster_ID = cutree(hc,k=10))

Now I want to find the "center" of each cluster, meaning the element with the smallest distance to the other members of its cluster. I cannot find R code for this. Can someone help? Thank you very much!

Upvotes: 0

Views: 2008

Answers (1)

Neal Fultz

Reputation: 9687

Following on from the example from ?hclust:

data(UScitiesD)
mds2 <- -cmdscale(UScitiesD)
hcity.D2 <- hclust(UScitiesD, "ward.D2")

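You can compute the distances within each cluster, then pick the point with the lowest mean distance to the other members of that cluster. Both steps can be composed with by() and an anonymous function. Note that the distances here are recomputed from the MDS coordinates in mds2, and that the number printed under each city is its position within its own cluster: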

lapply(by(mds2, cutree(hcity.D2, 4), dist), 
  function(x) which.min(colMeans(as.matrix(x))))
$`1`
Washington.DC 
            4 

$`2`
Denver 
     1 

$`3`
SanFrancisco 
           2 

$`4`
Miami 
    1 
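
If you only have a distance matrix (as with the DTW distances in the question) and no coordinates, one possible variation is to subset the original distance matrix by cluster instead of recomputing distances from MDS coordinates. The following is a minimal sketch under that assumption; find_medoids is a hypothetical helper name, not a base R function, and it picks the member with the smallest summed distance to the rest of its cluster:

```r
# Sketch: per-cluster "center" (medoid) taken directly from a distance matrix.
data(UScitiesD)                               # built-in distances between US cities
hc <- hclust(UScitiesD, method = "ward.D2")
groups <- cutree(hc, k = 4)                   # named integer vector: city -> cluster

dmat <- as.matrix(UScitiesD)                  # full symmetric matrix with city labels

# find_medoids is a hypothetical helper, introduced here for illustration only.
find_medoids <- function(dmat, groups) {
  sapply(split(names(groups), groups), function(members) {
    # distances among the members of this cluster only
    sub <- dmat[members, members, drop = FALSE]
    # member with the smallest total distance to the others
    members[which.min(rowSums(sub))]
  })
}

medoids <- find_medoids(dmat, groups)
print(medoids)
```

With the objects from the question, the equivalent call would presumably be find_medoids(as.matrix(DTWS2N), cutree(hc, k = 10)), assuming the distance matrix carries row names. Because this works on the original distances rather than a two-dimensional embedding, the chosen centers may differ from the ones printed above.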

Upvotes: 1
