Reputation: 291
I'm creating a bare-bones number line with the min, max, and median plotted, along with the average as a point on the line in a second layer. I'd like to change the color of the point depending on its position relative to the median.
Here is some example data:
library(dplyr)    # for %>%, filter(), pull()
library(ggplot2)

df <- data.frame(
  species = rep(c('dog','cat'),5),
  trait = 'weight',
  value = sample(5:25, 5))
df %>%
ggplot(aes(value,trait)) +
stat_boxplot(geom = "errorbar", width = 0.5) +
stat_summary(fun=median, geom="segment", aes(xend=..x.., yend=1.25, y = 0.75)) +
stat_summary(data = (df %>% filter(species == 'dog')), fun=mean,
geom="point", size=4,
shape = 15,
color = 'green') + # I'd like to change this color
theme_minimal() +
theme(line = element_blank(), text = element_blank())
Currently this is my plot, where the point is always green. I'd like it to turn red if the point is less than the median.
Below is a version that works, but I am wondering if there is anything cleaner and more efficient. Ideally, I could compare the x value of the point directly to the x value of the segment created in the prior stat_summary layer.
Something along the lines of this pseudocode:
ifelse(point.x > segment.x, 'green', 'red')
Current code:
color = ifelse(mean(df %>%
                      filter(species == 'dog') %>%
                      pull(value)) >
                 median(df %>% pull(value)),  # compare against the median, as described above
               'green', 'red'))
Upvotes: 2
Views: 504
Reputation: 23737
You could make use of the relatively new after_stat(). That said, if you're not quite sure of what happens in the computation of your Stat, the safer option would be to follow Jon's idea.
I used Jon's data (thank you and +1 :)
df <- data.frame(
species = rep(c('dog','cat'),5),
trait = 'weight',
value = sample(5:25, 10))
df %>%
  ggplot(aes(value, trait)) +
  stat_boxplot(geom = "errorbar", width = 0.5) +
  stat_summary(fun = median, geom = "segment",
               aes(xend = after_stat(x), yend = 1.25, y = 0.75)) +
  # after_stat(x) is the mean computed by the stat; median(df$value) is the median of the raw data
  stat_summary(aes(color = after_stat(x) < median(df$value)), fun = mean,
               geom = "point", size = 4,
               shape = 15)
Created on 2021-03-23 by the reprex package (v1.0.0)
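If it is unclear what after_stat(x) actually contains, one way to check is to build the plot and look at the computed layer data. The following is only a minimal sketch: the seed and the object names p and computed are my own assumptions, and the comparison uses median(df$value) so that the median is explicitly taken from the raw data rather than from the stat output.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only so the sketch is reproducible
df <- data.frame(
  species = rep(c("dog", "cat"), 5),
  trait = "weight",
  value = sample(5:25, 10)
)

# Only the point layer, so it is layer 1 of this plot
p <- ggplot(df, aes(value, trait)) +
  stat_summary(
    aes(color = after_stat(x) < median(df$value)),
    fun = mean, geom = "point", size = 4, shape = 15
  )

# layer_data() returns the data frame the layer draws,
# i.e. the stat output after the scales have been applied
computed <- layer_data(p, 1)
computed$x        # the mean computed by stat_summary()
median(df$value)  # the value it is compared against
```

Because the whole expression inside aes() is evaluated on the stat output, a bare x in there refers to the computed value, not the raw column, which is why the raw data is referenced through df$value here.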
Upvotes: 1
Reputation: 66480
df <- data.frame(
species = rep(c('dog','cat'),5),
trait = 'weight',
value = sample(5:25, 10)) # changed so that cat and dog get different values
I'd suggest calculating the by-group mean and median before ggplot to make the comparison simple.
df_sum <- df %>%
group_by(species) %>%
summarize(grp_median = median(value),
grp_mean = mean(value))
df %>%
ggplot(aes(value,species)) +
stat_boxplot(geom = "errorbar", width = 0.5) +
  geom_segment(data = df_sum,
               aes(y = as.numeric(as.factor(species)) - 0.1,
                   yend = as.numeric(as.factor(species)) + 0.1,
                   x = grp_median, xend = grp_median), inherit.aes = FALSE) +
  geom_point(data = df_sum,
             aes(grp_mean, species, color = grp_mean < grp_median)) +
  # named values keep red tied to TRUE even if only one of TRUE/FALSE occurs
  scale_color_manual(values = c("FALSE" = "green", "TRUE" = "red"), guide = "none") +
theme_minimal() +
theme(line = element_blank(), text = element_blank())
Upvotes: 2