Gaurav Bansal

Reputation: 5660

R kmeans final distance to centroid

I have run k-means on the iris dataset in R using the command kmeans_iris <- kmeans(iris[,1:4], centers=3). I now want to know the distance from a given observation in the iris dataset to the centroid of its assigned cluster. I could write code to calculate the Euclidean distance from each observation to its cluster's center manually, but is there an easy, built-in way to do this?

Upvotes: 4

Views: 3880

Answers (1)

thelatemail

Reputation: 93813

As far as I can tell, there isn't a built-in method for extracting the per-case distance. If I understand what you want correctly, you could code your own using fitted(), which returns the center of the assigned cluster for each observation:

sqrt(rowSums((iris[,1:4] - fitted(kmeans_iris))^ 2))
# [1] 0.14135063 0.44763825 0.41710910 0.52533799 0.18862662 0.67703767...

...for a Euclidean distance.

You could clean this up into a function if you wanted, where you specify the original data and the fitted k-means output.

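kmdist <- function(data, km) {
  # keep only the columns used in clustering, then subtract each row's
  # assigned cluster center (fitted(km)) and take the Euclidean norm
  sqrt(rowSums((data[, colnames(km$centers)] - fitted(km))^2))
}
kmdist(iris, kmeans_iris)
# [1] 0.14135063 0.44763825 0.41710910 0.52533799 0.18862662 0.67703767...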
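
One way to sanity-check distances like these is to compare them against the sums of squares that kmeans() already reports. The sketch below is only an illustration under a few assumptions: the seed is an arbitrary choice made so the run is repeatable, the names assigned_centers and dist_to_centroid are made up for this example, and indexing km$centers by km$cluster is used as an equivalent to fitted(). If the distances are right, both all.equal() checks should come back TRUE.

```r
# Sketch: cross-check per-observation distances against kmeans() output.
set.seed(42)  # assumption: arbitrary seed, kmeans() uses random starts
kmeans_iris <- kmeans(iris[, 1:4], centers = 3)

# Look up each observation's assigned center by indexing the centers
# matrix with the cluster vector (same rows that fitted() would return).
assigned_centers <- kmeans_iris$centers[kmeans_iris$cluster, ]
dist_to_centroid <- sqrt(rowSums((as.matrix(iris[, 1:4]) - assigned_centers)^2))

# Squared distances should add up to the total within-cluster sum of squares.
total_check <- all.equal(sum(dist_to_centroid^2), kmeans_iris$tot.withinss)

# Summed per cluster, they should match the per-cluster withinss vector.
per_cluster <- as.vector(tapply(dist_to_centroid^2, kmeans_iris$cluster, sum))
cluster_check <- all.equal(per_cluster, kmeans_iris$withinss)

print(total_check)
print(cluster_check)
```

Checking against tot.withinss and withinss does not need anything beyond base R, and it will flag a mismatch if the wrong columns or the wrong centers were used.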

Upvotes: 4
