R06

Reputation: 47

Plotting of cluster datapoints between two columns of a dataframe

I have a data frame in which two of the columns are Age and Income. I have clustered the data using k-means. Now I want to plot Age against Income, distinguishing the data points by cluster (using colours).

df

Age    Income    Cluster
20      10000     1
30      20000     2
40      25000     1
50      20000     2
60      10000     3
70      15000     3

.

plot(df$Age,df$Income)

I want to plot the data points of Age against Income, with each data point coloured according to its cluster.

Upvotes: 0

Views: 458

Answers (3)

R06

Reputation: 47

I found a solution using the base plot() function.

df

Age    Income
20      10000
30      20000
40      25000
50      20000
60      10000
70      15000

clust <- kmeans(df, centers = 3)  # df here has only Age and Income, without the "Cluster" column from the question

# clust$cluster holds the cluster assigned to each row; "cluster" is one component of the kmeans result.
# color = TRUE was dropped: it is not a graphical parameter and only produces warnings.
plot(df, col = clust$cluster, las = 1, xlab = "Age", ylab = "Income")

If your data frame contains more than two columns, subset it to the two columns you want to plot against each other.
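
A minimal sketch of that subsetting step, assuming a data frame with one extra column. The names df_full, df_sub and Score are made up for illustration, and set.seed() is only there because kmeans() starts from random centres.

```r
# Hypothetical data frame: Age and Income from the question plus an invented Score column
df_full <- data.frame(
  Age    = c(20, 30, 40, 50, 60, 70),
  Income = c(10000, 20000, 25000, 20000, 10000, 15000),
  Score  = c(1, 2, 3, 4, 5, 6)
)

# Keep only the two columns to cluster and plot
df_sub <- df_full[, c("Age", "Income")]

set.seed(1)  # fix the random start so the clustering is repeatable
clust <- kmeans(df_sub, centers = 3)

# One colour per cluster; clust$cluster has one integer label per row
plot(df_sub, col = clust$cluster, las = 1, xlab = "Age", ylab = "Income")
```

Note that Income is on a much larger scale than Age here, so it dominates the distances kmeans() uses; scaling the columns first (for example with scale()) may be worth considering.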

Upvotes: 0

rg255

Reputation: 4169

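You could use ggplot() from the ggplot2 package for this:

library(ggplot2)

ggplot(data = df) +
  geom_point(mapping = aes(x = Age, y = Income, color = factor(Cluster)))

This maps the aesthetics to columns of the data: the x position of each point comes from Age, the y position from Income, and the colour from Cluster. Wrapping Cluster in factor() makes ggplot treat it as a discrete grouping, so each cluster gets its own colour rather than a shade on a continuous gradient.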

You could also do this in base R. Here's an example using the mtcars dataset:

plot(x = mtcars$wt, y = mtcars$mpg, col = mtcars$cyl)
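
Applied to the data from the question, the same idea might look like the sketch below. The data frame is rebuilt by hand from the rows shown in the question, and the legend, pch and cl are additions for illustration, not part of the original post.

```r
# Data copied from the question, including the existing Cluster column
df <- data.frame(
  Age     = c(20, 30, 40, 50, 60, 70),
  Income  = c(10000, 20000, 25000, 20000, 10000, 15000),
  Cluster = c(1, 2, 1, 2, 3, 3)
)

# Treat the cluster labels as categories; the integer codes index the colour palette
cl <- factor(df$Cluster)

plot(df$Age, df$Income, col = as.integer(cl), pch = 19,
     xlab = "Age", ylab = "Income")

# Legend colours use the same integer codes as the points
legend("topright", legend = levels(cl), col = seq_along(levels(cl)),
       pch = 19, title = "Cluster")
```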

Upvotes: 1

S.Gradit

Reputation: 180

Try something like this:

library(ggplot2)

# factor() makes Cluster discrete, so each cluster gets a distinct colour
ggplot() + geom_point(data = df, aes(x = Age, y = Income, color = factor(Cluster)))
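
For a self-contained version, here is a sketch that rebuilds the question's data and converts Cluster to a factor once, up front. It assumes ggplot2 is installed; the variable p, the point size and the axis labels are illustrative choices.

```r
library(ggplot2)

# Data copied from the question
df <- data.frame(
  Age     = c(20, 30, 40, 50, 60, 70),
  Income  = c(10000, 20000, 25000, 20000, 10000, 15000),
  Cluster = c(1, 2, 1, 2, 3, 3)
)

# Convert once so every later plot treats Cluster as a category
df$Cluster <- factor(df$Cluster)

p <- ggplot(df, aes(x = Age, y = Income, color = Cluster)) +
  geom_point(size = 3) +
  labs(x = "Age", y = "Income", color = "Cluster")

print(p)  # draws the plot; at the console, typing p does the same
```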

Upvotes: 0
