greg dubrow

Reputation: 633

Adding labels in ggplot for summary statistics

About 18 months ago, this helpful exchange appeared, with code showing how to plot the median along with the interquartile range. Here's the code:

library(ggplot2)

# Argument names for ggplot2 >= 3.3.0; older versions use fun.y, fun.ymin, fun.ymax
ggplot(data = diamonds) +
  geom_pointrange(mapping = aes(x = cut, y = depth),
                  stat = "summary",
                  fun.min = function(z) {quantile(z, 0.25)},
                  fun.max = function(z) {quantile(z, 0.75)},
                  fun = median)

It produces a plot showing, for each cut, a point at the median depth and a line spanning the interquartile range.

What I'm wondering is how to add labels for the median and the interquartile range, and how to format the bar (color, alpha, etc.). I tried saving the plot as an object to see whether it contained anything I could then pass to formatting functions, but nothing was obvious when I inspected it in the RStudio IDE.

Is this even doable? I know I can do a boxplot, but that would have to include min/max. I'd like to produce boxplots with just the mean/median and the IQR.

Upvotes: 1

Views: 787

Answers (1)

G_T

Reputation: 1587

You can change the formatting as you would for any ggplot layer; in this case, see the docs page "Vertical intervals: lines, crossbars & errorbars". For example:

library(ggplot2)

# Argument names for ggplot2 >= 3.3.0; older versions use fun.y, fun.ymin, fun.ymax
ggplot(data = diamonds) +
  geom_pointrange(mapping = aes(x = cut, y = depth),
                  stat = "summary",
                  fun.min = function(z) {quantile(z, 0.25)},
                  fun.max = function(z) {quantile(z, 0.75)},
                  fun = median,
                  size = 4,             # <- adjusts size
                  colour = "red",       # <- adjusts colour
                  alpha = .3)           # <- adjusts transparency
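
As an aside on the "boxplot with just the median and IQR" part of the question: the same documentation page also covers geom_crossbar(), which draws a box with a middle line and no whiskers. The following is only a sketch, assuming ggplot2 3.3.0 or later (for the fun, fun.min and fun.max argument names); the object name p_crossbar and the width, fill and colour values are arbitrary choices for illustration.

```r
library(ggplot2)

# Box spanning the interquartile range with a line at the median, no whiskers
p_crossbar <- ggplot(data = diamonds, mapping = aes(x = cut, y = depth)) +
  geom_crossbar(stat = "summary",
                fun = median,
                fun.min = function(z) {quantile(z, 0.25)},
                fun.max = function(z) {quantile(z, 0.75)},
                width = 0.5,          # <- box width (illustrative value)
                fill = "steelblue",   # <- box fill
                colour = "navy",      # <- outline and median line
                alpha = .4)           # <- transparency

p_crossbar
```

The summary functions are the same ones used for geom_pointrange(), so only the geom changes.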


If you want to control the formatting of the points and lines individually, you need to do as @camille suggests and pre-process your data, because geom_pointrange() draws a single graphical object, so the points and lines are one and the same.

I would suggest something like this:

library(dplyr)
library(ggplot2)

diamonds %>%
  group_by(cut) %>%
  summarise(median = median(depth),
            lq = quantile(depth, 0.25),
            uq = quantile(depth, 0.75)) %>%
  ggplot(aes(cut, median)) +
  geom_linerange(aes(ymin = lq, ymax = uq), size = 4, colour = "blue", alpha = .4) +
  geom_point(size = 10, colour = "red", alpha = .8)
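
For the labels asked about in the question, one possible approach (a sketch, not the only way) is to store the summarised data and add geom_text() layers that reuse the same columns. The names depth_summary and p_labelled, the nudge_x offset and the text sizes are assumptions chosen for illustration and will likely need tuning for your plot.

```r
library(dplyr)
library(ggplot2)

# Summarise once so the same values drive both the geoms and the labels
depth_summary <- diamonds %>%
  group_by(cut) %>%
  summarise(median = median(depth),
            lq = unname(quantile(depth, 0.25)),
            uq = unname(quantile(depth, 0.75)))

p_labelled <- ggplot(depth_summary, aes(x = cut, y = median)) +
  geom_linerange(aes(ymin = lq, ymax = uq), colour = "blue", alpha = .4) +
  geom_point(size = 3, colour = "red", alpha = .8) +
  # nudge_x shifts each label to the right of its range
  geom_text(aes(label = round(median, 1)), nudge_x = 0.25, size = 3) +
  geom_text(aes(y = lq, label = round(lq, 1)), nudge_x = 0.25, size = 3) +
  geom_text(aes(y = uq, label = round(uq, 1)), nudge_x = 0.25, size = 3)

p_labelled
```

If the three labels for a cut sit too close together, adjusting the text size or using a different nudge_x per layer may help.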


Upvotes: 1
