Reputation: 376
Currently I am exploring the kmeans function. I have a simple text file (test.txt) with the following entries. The data can be split into 2 clusters.
1
2
3
8
9
10
How can I plot the results of the kmeans function (using the plot function) along with the original data? I am also interested in seeing how the clusters are distributed around their centroids.
Upvotes: 1
Views: 8980
Reputation: 8994
This is the example from example(kmeans):
# This is just to generate example data
test <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
              matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(test) <- c("V1", "V2")
# Store the kmeans result in a variable called cl (the outer parentheses also print it)
(cl <- kmeans(test, 2))
# Plot the data coloured by cluster, then overlay the centroids
plot(test, col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)
Edit
OP has some additional questions:
(cl <- kmeans(test, 2))
plot(test, col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)
The above code results in (case 1, plot not shown):
(cl <- kmeans(test[,1], 2))
plot(test[,1], col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)
The above code results in (case 2, plot not shown):
(cl <- kmeans(test[,1], 2))
plot(cbind(0,test[,1]), col = cl$cluster)
points(cbind(0,cl$centers), col = 1:2, pch = 8, cex = 2)
The above code results in (case 3, plot not shown):
explained
In case 1 the data has two dimensions (V1, V2), so the centroids have two coordinates, just like every point in the plot. In case 2 the data is one-dimensional (V1), just like your data. When R plots a single vector it uses each point's index as the x value. The centroids also have only one coordinate, so they are plotted at index positions 1 and 2, which is why you see them at the far left of the plot. Case 3 is what one-dimensional data actually looks like when you plot it in only one dimension.
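One way to check the case 2 behaviour yourself is to look at the coordinates R derives from a one-column matrix. This is only a sketch: it assumes points() resolves its arguments through xy.coords(), as base graphics does, and it uses the six values from the question as stand-in data. The seed and nstart value are arbitrary choices.

```r
# Stand-in data: the six values from the question
x <- c(1, 2, 3, 8, 9, 10)

set.seed(1)                               # arbitrary seed; kmeans starts from random centres
cl <- kmeans(x, centers = 2, nstart = 10)

# For one-dimensional input the centres come back as a 2 x 1 matrix
dim(cl$centers)

# xy.coords() shows the coordinates plot()/points() would use:
# the single column becomes y, and x is filled in with the row index
xy <- xy.coords(cl$centers)
xy$x   # index positions of the two centres
xy$y   # the actual centre values
```

The index positions in xy$x are what push the centroid markers to the left edge of the case 2 plot.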
conclusion
Your data is one-dimensional. If you plot it in two dimensions you get something like case 2, where the x values are index values supplied by R. Plotting it like that doesn't make much sense.
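For the data in the question, a variant of case 3 might look like the sketch below. It puts the values on the x-axis and holds y at 0, so the only real dimension is the data itself. The values are typed in directly. Reading them with scan("test.txt") is an assumption about the file holding one number per line, and the seed, nstart, and plotting parameters are illustrative choices.

```r
# Values from the question; if test.txt holds one number per line,
# x <- scan("test.txt") should give the same vector (assumption)
x <- c(1, 2, 3, 8, 9, 10)

set.seed(42)                              # arbitrary seed for reproducibility
cl <- kmeans(x, centers = 2, nstart = 10)

# Data points along a single axis, coloured by cluster membership
plot(x, rep(0, length(x)), col = cl$cluster, pch = 19,
     yaxt = "n", xlab = "value", ylab = "")

# Centroids on the same axis, using the same colour per cluster
points(cl$centers[, 1], rep(0, nrow(cl$centers)),
       col = 1:2, pch = 8, cex = 2)
```

Because the centroid colours are taken from 1:2 and the point colours from cl$cluster, each star shares the colour of the cluster it belongs to.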
Upvotes: 3