Reputation: 407
I have a dataset with multiple groups that I would like to color by group. For some points, though, chosen by a separate factor meeting a condition, I would like to change the color of just those points within each group. The issue I have run into is that if I map color to a second factor, the grouped data get split by that factor, which is not what I want. I have replicated the approach below:
library(ggplot2)
library(dplyr)
df <- mtcars %>% mutate(cyl = as.factor(cyl), vs = as.factor(vs))
ggplot(df, aes(x=cyl, y=mpg, col=cyl, size=drat)) +
  geom_boxplot() +
  geom_jitter(position=position_jitter(0.2))
This gives the basic plot I'm looking for (Base Figure), but now I'd like to recolor some of these points using a second factor (here, "vs").
ggplot(df, aes(x=cyl, y=mpg, col=cyl, size=drat)) +
  geom_boxplot() +
  geom_jitter(position=position_jitter(0.2)) +
  geom_jitter(data = . %>% filter(vs == 1), col = "purple",
              position=position_jitter(0.2))
Or
ggplot(df) +
  geom_boxplot(aes(x=cyl, y=mpg, col=cyl, size=drat)) +
  geom_jitter(aes(x=cyl, y=mpg, col=cyl, size=drat),
              position=position_jitter(0.2)) +
  geom_jitter(data = . %>% filter(vs == 1),
              aes(x=cyl, y=mpg, size=drat),
              col = "purple",
              position=position_jitter(0.2))
These two overlay approaches do the same thing. They almost work, but because the dataset is filtered to include only vs == 1, the jittered points don't land in the same place as in the full dataset (Overlay Plot).
Workarounds? Maybe a better approach?
Upvotes: 0
Views: 30
Reputation: 37903
qdread's suggestion in the comments is also good, but alternatively you can simply precompute a jitter and add it to the plot's x-position. Since factors get converted internally to integers at the boxplot layer, you'd have to use as.numeric() on the factor for the point layers; otherwise it complains about combining factors with numerics.
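As a quick aside on why as.numeric() lines the points up with the boxes: on a factor it returns the level codes (1, 2, 3, ...), not the original values. A minimal sketch, using a small made-up vector rather than the mtcars column:

```r
# Toy factor; the levels sort as "4", "6", "8"
cyl <- factor(c(6, 4, 8, 4))

# Level codes, which is what the point layers need for their x-position
codes <- as.numeric(cyl)

# Original values, recovered via the character labels (not what we want for x)
values <- as.numeric(as.character(cyl))

print(codes)
print(values)
```

So a point with cyl == 6 ends up at x = 2 plus its jitter, which should be where the second box sits on the discrete axis.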
library(ggplot2)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- mtcars %>% mutate(cyl = as.factor(cyl), vs = as.factor(vs))
df$jitter <- runif(nrow(df), min = -0.2, max = 0.2)
ggplot(df) +
  geom_boxplot(aes(x=cyl, y=mpg, col=cyl, size=drat)) +
  geom_point(aes(x=as.numeric(cyl) + jitter, y=mpg, col=cyl, size=drat)) +
  geom_point(data = . %>% filter(vs == 1),
             aes(x=as.numeric(cyl) + jitter, y=mpg, size=drat),
             col = "purple")
Created on 2020-12-04 by the reprex package (v0.3.0)
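If you'd rather not draw the points twice, another option might be to jitter everything in a single layer and map the point colour to a helper column. This is only a sketch: the column name pt_col, the "vs = 1" label and the hex colours are my own placeholders, and the colour mapping is kept out of the global aes() so that the boxplots are still grouped by cyl alone.

```r
library(ggplot2)
library(dplyr)

# pt_col is a hypothetical helper column: "vs = 1" for the highlighted
# points, otherwise the point's own cyl level as a string
df <- mtcars %>%
  mutate(cyl = as.factor(cyl),
         vs = as.factor(vs),
         pt_col = ifelse(vs == "1", "vs = 1", as.character(cyl)))

p <- ggplot(df, aes(x = cyl, y = mpg)) +
  # boxes are coloured (and grouped) by cyl only
  geom_boxplot(aes(colour = cyl)) +
  # one jitter layer, so every point is jittered exactly once
  geom_jitter(aes(colour = pt_col, size = drat), width = 0.2, height = 0) +
  # placeholder colours; one entry per value of cyl plus the highlight label
  scale_colour_manual(values = c("4" = "#F8766D", "6" = "#00BA38",
                                 "8" = "#619CFF", "vs = 1" = "purple"))

p
```

The trade-off is that the legend picks up an extra "vs = 1" entry, and you have to name the group colours yourself instead of relying on the defaults.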
Upvotes: 1