Augusto Umaña

Reputation: 3

Using ggplot, how can I frame the points in a scatter plot that belong to a group?

In my data I have assigned each observation to a group. I would like to plot the data in a scatter plot and draw a frame around each group to show the difference. As far as I have found, there are several functions in R that do this if you have data clustered with kmeans, pam or clara (for example autoplot() in the ggfortify package, fviz_cluster() in the factoextra package or clusplot() in the cluster package). As an example I use the iris data set and the ggfortify package.

library(ggplot2)
library(ggfortify)

#dimension reduction
IrisPrinComp <- princomp(formula = ~Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = iris)

IrisWithPrinComp <- cbind(iris, IrisPrinComp$scores)

IrisWithPrinComp$FactorSpecies <- factor(IrisWithPrinComp$Species)

#Plot
ggplot(data = IrisWithPrinComp, aes(x = Comp.1, y = Comp.2, color = FactorSpecies)) +
  geom_point()

Iris principal components scatter plot

Using ggfortify I get almost what I want, except that the frames follow the k-means clusters, which misclassify some observations, instead of my own groups:

Irisclusters <- kmeans(x = IrisWithPrinComp[,c("Comp.1", "Comp.2")], centers = 3, iter.max = 20, nstart = 5)

IrisWithPrinComp$Cluster <- factor(Irisclusters$cluster)

autoplot(Irisclusters, data = IrisWithPrinComp, frame = TRUE)

Framed kmeans scatter plot

Upvotes: 0

Views: 865

Answers (1)

yeedle

Reputation: 5008

You can use geom_encircle() from the ggalt package, which encloses each group of points in a polygon:

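library(dplyr)      # for the %>% pipe
library(ggplot2)
library(ggfortify)  # provides the fortify() method for princomp objects
library(tibble)     # for add_column()
# devtools::install_github("hrbrmstr/ggalt")
library(ggalt)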

princomp(formula = ~Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, 
         data = iris) %>%
  fortify() %>%
  add_column(FactorSpecies = factor(iris$Species)) %>%
  ggplot(aes(x = Comp.1, y = Comp.2, color = FactorSpecies)) + 
  geom_point() +
  geom_encircle(aes(group = FactorSpecies))
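
If installing ggalt is not an option, a similar frame can be sketched with ggplot2 alone by computing each group's convex hull with base R's chull() and drawing it with geom_polygon(). This is an alternative technique, not what geom_encircle() does internally, and the helper name hull_rows and the data frame names scores and hulls are made up for this illustration.

```r
library(ggplot2)

# Principal component scores for the iris measurements, as in the question
pca <- princomp(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                data = iris)
scores <- data.frame(pca$scores[, 1:2], Species = iris$Species)

# hull_rows() is a hypothetical helper: it keeps only the rows of one group
# that lie on that group's convex hull, in the order returned by chull()
hull_rows <- function(d) d[chull(d$Comp.1, d$Comp.2), ]

# Apply the helper to each species separately and stack the results
hulls <- do.call(rbind, lapply(split(scores, scores$Species), hull_rows))

# Points coloured by species, with an unfilled polygon framing each group
p <- ggplot(scores, aes(x = Comp.1, y = Comp.2, color = Species)) +
  geom_point() +
  geom_polygon(data = hulls, aes(group = Species), fill = NA)
p
```

The hull has straight edges and touches the outermost points, so it looks tighter and more angular than the curve from geom_encircle(). If a smooth outline is preferred and a distribution-based ellipse is acceptable, ggplot2's stat_ellipse() is another option to try.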

Upvotes: 3
