Reputation: 95
I have a range of values and I want to identify the cluster with the lowest values using kmeans. However, the cluster labels seem to be ordered differently than I was expecting.
test <- c(1,4,5,12,17,18,33,34)
cl <- kmeans(test, centers = 3, nstart = 10)
cl$cluster
[1] 2 2 2 1 1 1 3 3
# whereas I would have expected to get
[1] 1 1 1 2 2 2 3 3
How can I sort the output from kmeans in the way that I want?
Upvotes: 0
Views: 695
Reputation: 37641
There is no guarantee that low numbers will be grouped with other low numbers, and you do not say precisely how you want the clusters ordered. Here is one way: order the clusters by the smallest value in each cluster. That produces the result you asked for on this test data.
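# smallest value in each cluster (Group.1 = original label, x = minimum)
MT = aggregate(test, list(cl$cluster), min)
# MT$Group.1[order(MT$x)] lists the original labels from lowest to highest minimum;
# match() gives each point the position of its label in that list
match(cl$cluster, MT$Group.1[order(MT$x)])
[1] 1 1 1 2 2 2 3 3
If you want to propagate this change to cl, you can just make the assignment
cl$cluster = match(cl$cluster, MT$Group.1[order(MT$x)])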
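If you would rather order the clusters by their centres than by their smallest member, something along these lines should also work. This is only a sketch: relabel_by_centre is a made-up helper name, not a function from base R or stats, and it assumes one-dimensional data so that each centre is a single number.

```r
# Hypothetical helper, not part of base R or the stats package.
# Assumes one-dimensional data, so each centre is a single number.
relabel_by_centre <- function(fit) {
  # rank() gives each original label its position among the sorted centres
  new_label <- rank(fit$centers[, 1], ties.method = "first")
  # look up the new label for every point and drop the leftover names
  unname(new_label[fit$cluster])
}

test <- c(1, 4, 5, 12, 17, 18, 33, 34)
cl <- kmeans(test, centers = 3, nstart = 10)
ordered_labels <- relabel_by_centre(cl)

# Cluster means now increase with the label number
tapply(test, ordered_labels, mean)
```

Note that either approach only renumbers the label vector. The other components of cl (centers, size, withinss) keep their original order, so they would need the same reordering if you use them afterwards.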
Upvotes: 1