user3138373

Reputation: 523

Scatter plot and clusters within it

I created a scatter plot of my data using the ggplot2 package. Since my data has a large number of points, I will illustrate the problem with a small built-in dataset. Consider this scatter plot:

ggplot(mtcars, aes(x=wt, y=mpg)) + geom_point()

Scatter plot between wt and mpg

I want to cluster these data points with k-means and then show the clusters on the same scatter plot (the one shown above), not on a new dimensionality-reduction plot. How can I do this?

Upvotes: 1

Views: 1119

Answers (2)

TarJae

Reputation: 78917

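Here is an alternative using the factoextra package:

library(dplyr)       # provides %>% and select()
library(ggplot2)     # provides theme_bw()
library(factoextra)  # provides fviz_cluster()

# Keep only the two plotted variables
df <- mtcars %>% 
  select(x = wt, y = mpg)

# Compute k-means with k = 3 on the standardized variables
set.seed(123)
res.km <- kmeans(scale(df), 3, nstart = 25)

# Cluster assignment for each car
res.km$cluster

# With only two variables, fviz_cluster() plots them directly
# instead of principal components
fviz_cluster(res.km, data = df,
             palette = c("steelblue", "gold", "limegreen"), 
             geom = "point",
             ellipse.type = "convex", 
             ggtheme = theme_bw()
)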


Upvotes: 2

stefan

Reputation: 123928

One option would be to use ggforce::geom_mark_ellipse to draw some ellipses around your clusters:

library(ggplot2)
library(ggforce)

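# k-means starts from random centers, so fix the seed for reproducible clusters
set.seed(123)

# Note: this clusters on all (standardized) columns of mtcars, not just wt and mpg,
# so the ellipses may overlap in the wt-mpg plane
km.mtcars <- kmeans(scale(mtcars), centers = 3)

# Attach the cluster assignment to a copy of the data
mtcars2 <- mtcars
mtcars2$cluster <- km.mtcars$cluster

ggplot(mtcars2, aes(x = wt, y = mpg)) + 
  geom_point() +
  ggforce::geom_mark_ellipse(aes(fill = factor(cluster)))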
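
If the clusters should be based only on the two variables shown in the plot, a minimal sketch along the following lines may be closer to what the question asks for. It assumes k = 3 and standardized wt and mpg; the names xy, km, plot_df, centers and p are only illustrative. Points are colored by cluster, and the cluster centers are converted back to the original units and drawn as crosses.

```r
library(ggplot2)

# Assumption: cluster only on the two plotted variables, standardized
xy <- scale(mtcars[, c("wt", "mpg")])

set.seed(123)
km <- kmeans(xy, centers = 3, nstart = 25)

# Attach the cluster assignment to a copy of the data
plot_df <- mtcars
plot_df$cluster <- factor(km$cluster)

# Centers are in standardized units; convert them back to the original scale
centers <- sweep(km$centers, 2, attr(xy, "scaled:scale"), "*")
centers <- sweep(centers, 2, attr(xy, "scaled:center"), "+")
centers <- as.data.frame(centers)
centers$cluster <- factor(seq_len(nrow(centers)))

# Same scatter plot as in the question, colored by cluster, with centers as crosses
p <- ggplot(plot_df, aes(x = wt, y = mpg, colour = cluster)) +
  geom_point() +
  geom_point(data = centers, shape = 4, size = 4, stroke = 1.5)
p
```

Because the axes stay wt and mpg, this is the original scatter plot with cluster colors added, and it can be combined with geom_mark_ellipse() from above if outlines are wanted.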

Upvotes: 3
