Reputation: 47
I have a data frame in which two of the columns are Age and Income. I have clustered the data using k-means. Now I want to plot Income against Age, distinguishing the data points by cluster (using colours).
df
Age Income Cluster
20 10000 1
30 20000 2
40 25000 1
50 20000 2
60 10000 3
70 15000 3
.
plot(df$Age,df$Income)
I want to plot Income against Age, with each data point coloured according to its cluster.
Upvotes: 0
Views: 458
Reputation: 47
I found a solution using the base plot() function:
df
Age Income
20 10000
30 20000
40 25000
50 20000
60 10000
70 15000
clust <- kmeans(df, centers = 3)  # df without the last "Cluster" column from the question
# df contains only the columns Age and Income; cluster is a component of the kmeans object
plot(df, col = clust$cluster, las = 1, xlab = "Age", ylab = "Income")
If your data frame contains more than two columns, subset it to the two columns you want to plot against each other.
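As a minimal sketch of that subsetting step, assuming a data frame with one extra column (the `Id` column and the `xy` name are hypothetical, added only for illustration), the clustering and plot might look like this. `set.seed()` is included because `kmeans()` starts from randomly chosen centres, so cluster labels can otherwise differ between runs.

```r
# Example data from the question, plus a hypothetical extra column (Id)
df <- data.frame(
  Id     = 1:6,
  Age    = c(20, 30, 40, 50, 60, 70),
  Income = c(10000, 20000, 25000, 20000, 10000, 15000)
)

# Keep only the two columns to cluster and plot
xy <- df[, c("Age", "Income")]

set.seed(1)                       # kmeans() picks random starting centres
clust <- kmeans(xy, centers = 3)

# clust$cluster holds one integer label (1 to 3) per row,
# which base graphics uses as an index into the colour palette
plot(xy, col = clust$cluster, pch = 19, las = 1,
     xlab = "Age", ylab = "Income")
legend("topright", legend = sort(unique(clust$cluster)),
       col = sort(unique(clust$cluster)), pch = 19, title = "Cluster")
```

Because Age and Income are on very different scales, Income dominates the distances here; clustering on `scale(xy)` instead is one option if that is not what you want.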
Upvotes: 0
Reputation: 4169
You could use ggplot() for this:
library(ggplot2)
ggplot(data = df) +
  geom_point(mapping = aes(x = Age, y = Income, color = Cluster))
Here the aesthetics are mapped from the values in the data: the x position of each point comes from Age, the y position from Income, and the colour from the Cluster variable.
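If `Cluster` is stored as a number, ggplot2 treats it as continuous and draws a colour gradient; wrapping it in `factor()` gives one distinct colour per cluster. A minimal sketch using the question's example data (the object name `p` and the point size are arbitrary choices):

```r
library(ggplot2)

# Example data from the question
df <- data.frame(
  Age     = c(20, 30, 40, 50, 60, 70),
  Income  = c(10000, 20000, 25000, 20000, 10000, 15000),
  Cluster = c(1, 2, 1, 2, 3, 3)
)

# factor(Cluster) makes the colour scale discrete rather than a gradient
p <- ggplot(df, aes(x = Age, y = Income, color = factor(Cluster))) +
  geom_point(size = 3) +
  labs(color = "Cluster")   # tidy the legend title

print(p)
```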
You could also do this in base R. Here's an example using the mtcars dataset:
plot(x = mtcars$wt, y = mtcars$mpg, col = mtcars$cyl)
Upvotes: 1
Reputation: 180
Try something like this:
library(ggplot2)
ggplot() + geom_point(data = df, aes(x = Age, y = Income, group = Cluster, color = Cluster))
Upvotes: 0