OSK

Reputation: 596

K-means clustering in R: mapping clusters to data points

With the code below I get the clustering output and a plot of Sepal.Length against Sepal.Width. However, I also want to know which data points belong to which cluster. How can this be done?

> newiris <- iris
> newiris$Species <- NULL

> (kc <- kmeans(newiris, 3)) 
K-means clustering with 3 clusters of sizes 38, 50, 62

Cluster means:
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1     6.850000    3.073684     5.742105    2.071053
2     5.006000    3.428000     1.462000    0.246000
3     5.901613    2.748387     4.393548    1.433871

Clustering vector:
  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 1 3 3 3 3 3
 [59] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 3 3 3 3 3 3 3 3 3
 [88] 3 3 3 3 3 3 3 3 3 3 3 3 3 1 3 1 1 1 1 3 1 1 1 1 1 1 3 3 1
[117] 1 1 1 3 1 3 1 3 1 1 3 3 1 1 1 1 1 3 1 1 1 1 3 1 1 1 3 1 1
[146] 1 3 1 1 3

Within cluster sum of squares by cluster:
[1] 23.87947 15.15100 39.82097

Available components:
[1] "cluster"  "centers"  "withinss" "size"   


> table(iris$Species, kc$cluster)

              1  2  3
  setosa      0 50  0
  versicolor  2  0 48
  virginica  36  0 14

> plot(newiris[c("Sepal.Length", "Sepal.Width")], col=kc$cluster)
> points(kc$centers[,c("Sepal.Length", "Sepal.Width")], col=1:3, pch=8, cex=2)

Upvotes: 0

Views: 217

Answers (2)

r.bot

Reputation: 5424

As Thomas and Mamoun have said, the cluster information is in kc$cluster, in the same order as the original observations. It can be added back to the original data set as follows:

newiris <- cbind(newiris, cluster = kc$cluster)
head(newiris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width cluster
1          5.1         3.5          1.4         0.2       1
2          4.9         3.0          1.4         0.2       1
3          4.7         3.2          1.3         0.2       1
4          4.6         3.1          1.5         0.2       1
5          5.0         3.6          1.4         0.2       1
6          5.4         3.9          1.7         0.4       1
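
Note that cluster numbers are arbitrary labels that depend on the random starting centers, so the same group of points can be numbered 1 in one run and 2 in another. A minimal sketch of a reproducible version follows, assuming you are happy to fix the seed; the seed value and the names `clustered` and `by_cluster` are illustrative choices, not part of the original code. It also keeps the cluster column out of `newiris`, so that re-running kmeans on `newiris` does not accidentally cluster on the labels.

```r
# Fix the random seed so the cluster labels are the same on every run.
# The value 42 is arbitrary and only used for this sketch.
set.seed(42)

newiris <- iris
newiris$Species <- NULL

kc <- kmeans(newiris, centers = 3)

# Attach the assignment to a copy of the full data set,
# leaving the numeric input (newiris) unchanged.
clustered <- cbind(iris, cluster = kc$cluster)

# Split the rows into a list with one data frame per cluster.
by_cluster <- split(clustered, clustered$cluster)

# Number of rows in each cluster; these counts agree with kc$size.
sapply(by_cluster, nrow)
```

Each element of `by_cluster` can then be inspected on its own, for example with `head(by_cluster[["1"]])`.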

Upvotes: 1

Thomas

Reputation: 44527

You're already showing us the answer: kc$cluster gives the cluster that each observation was assigned to. It is printed by default, and you can run str(kc) to see everything the kmeans function returns.

str(kc)
## List of 9
##  $ cluster     : int [1:150] 1 3 3 3 1 1 1 1 3 3 ...
##  $ centers     : num [1:3, 1:4] 5.18 6.31 4.74 3.62 2.9 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:3] "1" "2" "3"
##   .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
##  $ totss       : num 681
##  $ withinss    : num [1:3] 6.43 118.65 17.67
##  $ tot.withinss: num 143
##  $ betweenss   : num 539
##  $ size        : int [1:3] 33 96 21
##  $ iter        : int 2
##  $ ifault      : int 0
##  - attr(*, "class")= chr "kmeans"
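
Because kc$cluster is an integer vector in the same order as the input rows, it can be used directly for indexing. The following is a rough sketch of pulling out the members of one cluster, assuming the four numeric iris columns as input; the seed value and the name `idx` are illustrative only.

```r
# Arbitrary seed, used only to make this sketch repeatable.
set.seed(1)
kc <- kmeans(iris[, 1:4], centers = 3)

# Row indices of the observations assigned to cluster 1.
idx <- which(kc$cluster == 1)

# The corresponding rows of the original data, Species included.
head(iris[idx, ])

# Number of observations per cluster, the same counts as kc$size.
table(kc$cluster)
```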

Upvotes: 2
