tomka

Reputation: 2638

How to add legend information for the meaning of points added to a boxplot in ggplot2 in R?

Consider the following grouped boxplot, in which a point is added for each group to show some additional information (here, the variance). How can I add a legend entry stating that the triangles denote the variance?

The desired result is a legend that first distinguishes the boxplot groups "g2" (red is 0, blue is 1; this is already in the plot) and then has an extra row or section identifying the triangle symbols as the group-specific variances (this is missing and needed).

library(ggplot2)
n=1000
dat = data.frame(    g1 = as.factor(rbinom(n,1,0.5)), 
                     g2 = as.factor(rbinom(n,1,0.5)))
dat$x = rnorm(n, as.numeric(dat$g1)+ as.numeric(dat$g2) , as.numeric(dat$g1) + as.numeric(dat$g2))
dat.var = aggregate(x ~ g1 + g2, data = dat, var)

ggplot(dat,aes(x=g1, y=x, fill=g2)) + 
  geom_boxplot(outlier.size=0.5)    +
  geom_point(data = dat.var, aes(x=g1, y=x, group=g2), pch=17, col="black",
             position=position_dodge(width=0.75), size = 3)

Upvotes: 1

Views: 286

Answers (2)

Louis

Reputation: 3632

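You can map the shape to a constant label inside aes(), set the symbol for that label with the scale_shape_manual function, and remove the redundant aes mappings. The code below reuses dat and dat.var from the question: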

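library(ggplot2)
library(dplyr)  # only needed here for the %>% pipe

# dat and dat.var are the data frames built in the question
dat %>% 
  ggplot(aes(x = g1, y = x)) +
  # fill is mapped in the boxplot layer only, so the points do not inherit it
  geom_boxplot(aes(fill = g2), outlier.size = 0.5) +
  geom_point(
    data = dat.var,
    # mapping shape (pch) to the constant 'variance' creates a legend entry
    aes(group = g2, pch = 'variance'),
    col = "black",
    position = position_dodge(width = 0.75),
    size = 3
  ) +
  # assign the triangle (17) to that label and set the legend title
  scale_shape_manual(values = c('variance' = 17), name = 'Vars')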

Hope this helps.

Upvotes: 1

Max Teflon

Reputation: 1800

Something like this?

library(ggplot2)
n=1000
dat = data.frame(    g1 = as.factor(rbinom(n,1,0.5)), 
                     g2 = as.factor(rbinom(n,1,0.5)))
dat$x = rnorm(n, as.numeric(dat$g1)+ as.numeric(dat$g2) , as.numeric(dat$g1) + as.numeric(dat$g2))
dat.var = aggregate(x ~ g1 + g2, data = dat, var)

ggplot(dat,aes(x=g1, y=x, fill=g2)) + 
  geom_boxplot(outlier.size=0.5)    +
  geom_point(data = dat.var, aes(x=g1, y=x, group=g2,pch = 'variance'), col="black",
             position=position_dodge(width=0.75), size = 3) +
  scale_shape_manual(values = c('variance' = 17), name = ' ')

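The pch = 'variance' part of the point layer maps the shape to a constant, which creates the legend entry and sets its label. scale_shape_manual then assigns the desired shape (17, a filled triangle) to that label, and name = ' ' replaces the legend title, which would otherwise also read 'variance', with a blank.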
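If the order of the two legends matters (fill groups first, then the triangle entry, as the question asks), one possible variation is sketched below. It rests on a few assumptions that are not in the original posts: a fixed seed via set.seed so the example is reproducible, shape written out instead of its alias pch, name = NULL to drop the legend title entirely, and guide_legend(order = ...) to pin the legend order.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
n <- 1000
dat <- data.frame(g1 = as.factor(rbinom(n, 1, 0.5)),
                  g2 = as.factor(rbinom(n, 1, 0.5)))
dat$x <- rnorm(n,
               as.numeric(dat$g1) + as.numeric(dat$g2),
               as.numeric(dat$g1) + as.numeric(dat$g2))

# one variance per g1 x g2 combination
dat.var <- aggregate(x ~ g1 + g2, data = dat, var)

p <- ggplot(dat, aes(x = g1, y = x)) +
  # fill is mapped in the boxplot layer only
  geom_boxplot(aes(fill = g2), outlier.size = 0.5) +
  geom_point(data = dat.var,
             # constant label mapped to shape -> separate legend entry
             aes(group = g2, shape = "variance"),
             colour = "black",
             position = position_dodge(width = 0.75),
             size = 3) +
  # triangle for the "variance" label; NULL removes the legend title
  scale_shape_manual(values = c(variance = 17), name = NULL) +
  # assumption: explicit ordering, fill legend first, shape legend second
  guides(fill = guide_legend(order = 1),
         shape = guide_legend(order = 2))

p
```

Because fill is not mapped in the point layer here, the points should only contribute to the shape legend and not to the g2 legend keys.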

Upvotes: 1
