Michael Clauss

Reputation: 79

Selecting color to mark means on boxplot with stat_summary

There are several useful pages about marking means on boxplots with multiple series, but even with those I can't select a color for the mean points and still show a separate mean for each class. I can do this:

library(ggplot2)
d <- subset(mpg,class=="compact"|class=="midsize")
ggplot(d,aes(drv,hwy,color=class)) + geom_boxplot() + scale_color_manual(values=c("blue","orange")) +
  stat_summary(fun=mean,size=.5,shape=5,position=position_dodge(width=.75))

That gives me the two different means, but they're the same color as the boxplots themselves, so they're hard to see.

[Image: first attempt at boxplot, means distinguished]

So I add a color specification into the code:

ggplot(d,aes(drv,hwy,color=class)) + geom_boxplot() + scale_color_manual(values=c("blue","orange")) +
  stat_summary(fun=mean,size=.5,color="black",shape=5,position=position_dodge(width=.75))

But then it only shows one mean for each drv value instead of one per class.

[Image: second attempt, means not distinguished]

So what am I missing to get both a specified color and a marked mean for each class?

Upvotes: 1

Views: 816

Answers (2)

YBS

Reputation: 21297

Map class to fill instead of color for the boxes, and set a constant color in stat_summary(). That gives the desired output.

ggplot(d,aes(drv,hwy, fill=class)) + geom_boxplot() + scale_fill_manual(values=c("cyan","orange")) +
    stat_summary(fun=mean,size=.5, color="red", 
                 shape=5,position=position_dodge(width=.75))

[Image: output]
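A quick way to check that this keeps one mean per class is to inspect the data computed for the stat_summary() layer. The sketch below assumes the same d subset as in the question; the names p_fill and means_fill are only illustrative. Because class is still mapped to fill in that layer, the constant color does not remove the grouping, so there should be one row per drv/class combination.

```r
library(ggplot2)

# Same subset as in the question
d <- subset(mpg, class == "compact" | class == "midsize")

p_fill <- ggplot(d, aes(drv, hwy, fill = class)) +
  geom_boxplot() +
  stat_summary(fun = mean, size = .5, colour = "red",
               shape = 5, position = position_dodge(width = .75))

# Layer 2 is the stat_summary() layer; layer_data() returns its computed data
means_fill <- layer_data(p_fill, 2)

# One row per drv/class combination is expected here
nrow(means_fill)
```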

Upvotes: 2

Mikko Marttila

Reputation: 11878

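When you override the colour aesthetic in stat_summary() with a constant, the colour = class mapping is dropped for that layer, and with it the grouping information that ggplot2 derives from the discrete mapped variables. You need to bring it back explicitly with aes(group = class):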

library(ggplot2)

d <- subset(mpg, class == "compact" | class == "midsize")

ggplot(d, aes(drv, hwy, color = class)) +
  geom_boxplot() +
  stat_summary(
    aes(group = class),
    colour = "black",
    fun = mean,
    size = .5,
    shape = 5,
    position = position_dodge(width = .75)
  )
#> Warning: Removed 4 rows containing missing values (geom_segment).
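
To see the lost grouping directly, you can compare the computed layer data with and without the explicit group. This is only a sketch: add_means() is a hypothetical helper defined here, not part of ggplot2, and geom = "point" with size = 2 is my substitution for the default pointrange geom, not something from the code above. With the constant colour and no group, the summary layer should have two rows (one per drv); with aes(group = class) it should have four.

```r
library(ggplot2)

d <- subset(mpg, class == "compact" | class == "midsize")

base <- ggplot(d, aes(drv, hwy, colour = class)) +
  geom_boxplot()

# Hypothetical helper (not a ggplot2 function): adds black mean markers,
# optionally with an extra layer-level mapping.
# geom = "point" is an assumption made for this sketch; the answer above
# uses the default pointrange geom.
add_means <- function(mapping = NULL) {
  stat_summary(
    mapping = mapping,
    fun = mean,
    geom = "point",
    colour = "black",
    size = 2,
    shape = 5,
    position = position_dodge(width = .75)
  )
}

# Layer 2 is the summary layer in both plots
means_lost <- layer_data(base + add_means(), 2)
means_kept <- layer_data(base + add_means(aes(group = class)), 2)

nrow(means_lost)  # grouping falls back to drv only
nrow(means_kept)  # one mean per drv/class combination
```

The warning in the output above comes from the default pointrange geom: only fun is supplied, so there are no ymin/ymax values for the line segment to draw. Drawing the means with geom = "point", as in this sketch, avoids that; size = 2 is chosen to roughly match the pointrange marker, which scales its point by a default factor of 4.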

Upvotes: 2
