Reputation: 11
This is my vector before kmeans:
> sort(table(mydata))
mydata
23 7 9 4 10 3 5 8 2 1
1 3 3 4 5 6 6 6 7 9
km <- kmeans(mydata, centers = 10)
After kmeans:
> sort(table(km$cluster))
km$cluster
1 6 7 3 5 2 4 10 8 9
1 3 3 4 5 6 6 6 7 9
Judging by the counts, all my 1s are stored in cluster 9, all my 2s are stored in cluster 8, and so on.
Is there a way in R to find which cluster a particular number belongs to? For example, how do I find which cluster my 1s are in?
Upvotes: 0
Views: 1306
Reputation: 60321
Extending MrFlick's answer (upvoted): in case you want the cluster number programmatically, you could also do this (using the magrittr package to avoid all the nested parentheses):
library(magrittr)
data.point <- 5 # put the data point here
cluster.no <- c(mydata==data.point) %>% which %>% km$cluster[.] %>% unique
Examples:
library(magrittr)
set.seed(42) # for reproducibility
mydata <- rep(c(23,7,9,4,10,3,5,8,2,1), c(1,3,3,4,5,6,6,6,7,9))
km <- kmeans(mydata, centers = 10)
data.point <- 23
c(mydata==data.point) %>% which %>% km$cluster[.] %>% unique
# 8
data.point <- 10
c(mydata==data.point) %>% which %>% km$cluster[.] %>% unique
# 1
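If you would rather not load magrittr, the same lookup can be written in base R. This is only a minimal sketch: cluster_of is a hypothetical helper name I made up for illustration, and it assumes mydata is the plain numeric vector that was passed to kmeans. The cluster labels depend on the random start, so I am not showing expected output.

```r
set.seed(42) # for reproducibility
mydata <- rep(c(23, 7, 9, 4, 10, 3, 5, 8, 2, 1), c(1, 3, 3, 4, 5, 6, 6, 6, 7, 9))
km <- kmeans(mydata, centers = 10)

# Hypothetical helper: return the cluster label(s) assigned to a given value.
# km$cluster is in the same order as the input data, so a logical index works.
cluster_of <- function(value, data, fit) {
  unique(fit$cluster[data == value])
}

cluster_of(23, mydata, km)
cluster_of(10, mydata, km)
```

If a value were ever split across several clusters, cluster_of would return more than one label, and it returns integer(0) for a value that is not in the data.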
Upvotes: 2
Reputation: 206167
The values in km$cluster are returned in the same order as your original data.
mydata <- rep(c(23,7,9,4,10,3,5,8,2,1), c(1,3,3,4,5,6,6,6,7,9))
sort(table(mydata))
# mydata
# 23 7 9 4 10 3 5 8 2 1
# 1 3 3 4 5 6 6 6 7 9
km <- kmeans(mydata, centers = 10)
unique(cbind(value=mydata, clust=km$cluster))
# value clust
# [1,] 23 9
# [2,] 7 5
# [3,] 9 7
# [4,] 4 4
# [5,] 10 1
# [6,] 3 10
# [7,] 5 2
# [8,] 8 8
# [9,] 2 6
# [10,] 1 3
Here I've just re-joined the two with cbind and used unique to eliminate all the duplicates, which works because your data is so discrete.
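To answer the "which cluster are my 1s in" part directly, you can query that value/cluster table. A rough sketch, assuming the same mydata as above; tab, clust_of_1 and hits are just names I picked for illustration, and the actual cluster numbers will vary from run to run unless you set a seed.

```r
mydata <- rep(c(23, 7, 9, 4, 10, 3, 5, 8, 2, 1), c(1, 3, 3, 4, 5, 6, 6, 6, 7, 9))
km <- kmeans(mydata, centers = 10)

# One row per distinct (value, cluster) pair
tab <- unique(cbind(value = mydata, clust = km$cluster))

# Cluster for a single value
clust_of_1 <- tab[tab[, "value"] == 1, "clust"]

# Rows for several values at once; match() gives the first matching row for each
hits <- tab[match(c(1, 2, 23), tab[, "value"]), , drop = FALSE]
hits
```

Note that match only returns the first row for each value, so if your data were less discrete and a value landed in more than one cluster, you would want the logical-index form instead.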
Upvotes: 5