user3206440

Reputation: 5049

Add legend to indicate shapes

I need help adding a legend for the shapes used in the plot described below. The plot is a box plot with points for the group means and error bars for the confidence intervals.

The resulting plot is shown below. How do I add a legend indicating that the red circles are the means and the green error bars are the confidence intervals, like the one in the image below?

Required legend: [image: Legend]

Plot: [image: Box plot with mean & CI]

The data and code used to generate the above is given below for reference.

library(ggplot2)

df <- data.frame(cbind(mtcars[,1], mtcars[,2])) #mtcars[, 1:2]
colnames(df) <- c("metric", "group")
df$group <- factor(df$group)

p1 <- ggplot(data=df, aes(x=group, y=metric ) ) +
  geom_boxplot()

metric_means <- aggregate(df$metric, list(df$group), mean) 
metric_ci_95 <- aggregate(df$metric, list(df$group), function(x){1.96*sd(x)/sqrt(length(x))})
metric_mean_ci = data.frame(group=metric_means[,1],mean=metric_means[,2], ci=metric_ci_95[,2])

# plot mean
p1 <- p1 + geom_point(data=metric_means, aes(x=metric_means[,1], y=metric_means[,2]),
                      colour="red", shape=21, size=2)

#plot confidence interval
p1 <- p1 + geom_errorbar(data=metric_mean_ci, aes(ymin=mean-ci, ymax=mean+ci, x=group, y=mean),
                         color="green", width=.1)

p1

What needs to be added to the code above to get a legend that explains which summary statistic the circles and the error bars represent?

Upvotes: 4

Views: 1549

Answers (1)

Mark Peterson

Reputation: 9560

If you really want to color them separately, you can use the code below. I use geom_linerange instead of geom_errorbar to get a vertical line in the legend. In addition, as suggested, I map the colors inside aes to get the legend, and then use override.aes to control which glyph (line or point) is drawn for each legend entry.

ggplot(data=df, aes(x=group, y=metric ) ) +
  geom_boxplot() +
  geom_point(data=metric_means
             , aes(x=metric_means[,1]
                   , y=metric_means[,2]
                   , colour = "Mean")
             , shape=21, size=2) +
  geom_linerange(data=metric_mean_ci
                 , aes(ymin=mean-ci
                      , ymax=mean+ci
                      , x=group
                      , y=mean
                      , color="95% CI")
                ) +
  scale_color_manual(name = "", values = c("95% CI" = "green", "Mean" = "red")) +
  guides(colour = guide_legend(override.aes = list(linetype = c("solid", "blank")
                                                   , shape = c(NA, 1))))

Gives:

[image: resulting plot]

An alternative that requires less setup is to use a function already available in ggplot2, specifically stat_summary:

ggplot(data=df
       , aes(x=group, y=metric ) ) +
  geom_boxplot() +
  stat_summary(
    aes(color = "Mean and 95% CI")
    , fun.data = mean_cl_normal
    )

Gives:

[image: resulting plot]
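Note that mean_cl_normal wraps Hmisc::smean.cl.normal, so it needs the Hmisc package installed, and it builds the interval from the t distribution rather than the fixed 1.96 multiplier used in the question. If you want stat_summary to reproduce the question's interval exactly, one option is to pass your own summary function. The following is a minimal sketch: mean_ci_196 is a hypothetical helper name (not part of ggplot2), and the red colour is just chosen to match the question.

```r
library(ggplot2)

# Same data as in the question: mpg as the metric, cyl as the group
df <- data.frame(metric = mtcars$mpg, group = factor(mtcars$cyl))

# Hypothetical helper (not part of ggplot2): returns the y, ymin and ymax
# columns that stat_summary(fun.data = ...) expects, using the
# normal-approximation interval from the question (1.96 * sd / sqrt(n))
mean_ci_196 <- function(x) {
  m <- mean(x)
  half_width <- 1.96 * sd(x) / sqrt(length(x))
  data.frame(y = m, ymin = m - half_width, ymax = m + half_width)
}

p <- ggplot(df, aes(x = group, y = metric)) +
  geom_boxplot() +
  # Mapping colour inside aes() is what creates the legend entry
  stat_summary(aes(colour = "Mean and 95% CI"), fun.data = mean_ci_196) +
  scale_colour_manual(name = "", values = c("Mean and 95% CI" = "red"))

p
```

Because the mean and the interval are drawn by a single pointrange layer, they share one legend entry. If you need separate entries for the mean and the interval, the two-layer approach with override.aes is still the way to go.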

Upvotes: 3
