Reputation: 16081
In the R programming environment, I am currently using the standard implementation of the kmeans algorithm (see help(kmeans)). It appears that I cannot initialize the starting centroids. I ask kmeans for 4 clusters, and I would like to pass the coordinates of the starting centroids myself.
Is there a way to get kmeans to accept initial centroid coordinates?
Upvotes: 3
Views: 3960
Reputation: 61913
Yes. The implementation you mention allows you to specify starting positions. You pass them in through the centers parameter, as a matrix with one row per starting centroid and one column per variable:
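> # Simulate 99 points around three centres: (-5, -5), (0, 0) and (5, 5)
> dat <- data.frame(x = rnorm(99, mean = c(-5, 0, 5)), y = rnorm(99, mean = c(-5, 0, 5)))
> plot(dat)
> # Starting centroids, one per row; matrix() fills column by column
> start <- matrix(c(-5, 0, 5, -5, 0, 5), 3, 2)
> kmeans(dat, start)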
K-means clustering with 3 clusters of sizes 33, 33, 33
Cluster means:
           x           y
1 -5.0222798 -5.06545689
2 -0.1297747 -0.02890204
3  4.8006581  5.00315151
Clustering vector:
[1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2
[51] 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
Within cluster sum of squares by cluster:
[1] 58.05137 73.81878 52.45732
(between_SS / total_SS = 94.7 %)
Available components:
[1] "cluster" "centers" "totss" "withinss" "tot.withinss" "betweenss"
[7] "size"
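For the four-cluster case in the question, the same idea can be sketched as follows. The data, the seed, and the names true_means, start and fit are made up for illustration. The only part taken from the answer above is that centers accepts a matrix of starting coordinates.

```r
set.seed(42)  # assumed seed, only so the sketch is reproducible

# Hypothetical data: 25 points around each of four assumed locations
true_means <- rbind(c(-5, -5), c(-5, 5), c(5, -5), c(5, 5))
dat <- data.frame(
  x = rnorm(100, mean = rep(true_means[, 1], each = 25)),
  y = rnorm(100, mean = rep(true_means[, 2], each = 25))
)

# Starting centroids: one row per cluster, one column per variable in dat.
# byrow = TRUE lets each row be written out as an (x, y) pair.
start <- matrix(c(-4, -4,
                  -4,  4,
                   4, -4,
                   4,  4),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("x", "y")))

# Passing a matrix as centers fixes both the number of clusters (its row
# count) and the starting positions
fit <- kmeans(dat, centers = start)

fit$centers         # final centroids after the algorithm has converged
table(fit$cluster)  # number of points assigned to each cluster
```

The rows of start must be distinct, and the matrix needs as many columns as the data has variables. Note that matrix() fills column by column unless byrow = TRUE is given, which is why the three-cluster example above lists all x coordinates first and then all y coordinates.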
Upvotes: 6