user4575913

Reputation: 567

How to add shaded confidence intervals to line plot with specified values

I have a small table of summary data containing the odds ratio and the upper and lower confidence limits for four categories, with six levels within each category. I'd like to produce a chart using ggplot2 that looks similar to the usual one you get when you specify an lm fit and its standard error, but I'd like R to use the pre-specified values from my table instead. I've managed to create the line graph with error bars, but the bars overlap and make the plot unclear. The data look like this:

interval  OR     Drug  lower  upper
14        0.004  a     0.002  0.205
30        0.022  a     0.001  0.101
60        0.13   a     0.061  0.23
90        0.22   a     0.14   0.34
180       0.25   a     0.17   0.35
365       0.31   a     0.23   0.41
14        0.84   b     0.59   1.19
30        0.85   b     0.66   1.084
60        0.94   b     0.75   1.17
90        0.83   b     0.68   1.01
180       1.28   b     1.09   1.51
365       1.58   b     1.38   1.82
14        1.9    c     0.9    4.27
30        2.91   c     1.47   6.29
60        2.57   c     1.52   4.55
90        2.05   c     1.31   3.27
180       2.422  c     1.596  3.769
365       2.83   c     1.93   4.26
14        0.29   d     0.04   1.18
30        0.09   d     0.01   0.29
60        0.39   d     0.17   0.82
90        0.39   d     0.2    0.7
180       0.37   d     0.22   0.59
365       0.34   d     0.21   0.53

I have tried this:

limits <- aes(ymax=upper, ymin=lower)
dodge <- position_dodge(width=0.9)
ggplot(data, aes(y=OR, x=interval, colour=Drug)) + 
  geom_line(stat="identity") + 
  geom_errorbar(limits, position=dodge)

and searched for a suitable answer to create a pretty plot, but I'm flummoxed!

Any help greatly appreciated!

Upvotes: 30

Views: 71910

Answers (2)

Kodiakflds

Reputation: 623

Here is a base R approach using polygon(), since @jmb requested one in the comments. polygon() draws a shape by tracing its outer perimeter, so I have to supply two sets of x-values with their associated y-values: the x-values in order paired with the lower limits, followed by the x-values in reverse paired with the upper limits. I set plot type = 'n' to draw an empty plot first and then call points() separately so the points sit on top of the polygon. My personal preference is the ggplot solutions in the other answers when possible, since polygon() is pretty clunky.

library(tidyverse)

data('mtcars')  #built in dataset

# mean mpg per cylinder count, with mean +/- one standard error
mean.mpg <- mtcars %>% 
  group_by(cyl) %>% 
  summarise(N = n(),
            avg.mpg = mean(mpg),
            SE.low = avg.mpg - (sd(mpg) / sqrt(N)),
            SE.high = avg.mpg + (sd(mpg) / sqrt(N)))


plot(avg.mpg ~ cyl, data = mean.mpg, ylim = c(10,30), type = 'n')

# note I have defined c(x1, x2) and c(y1, y2):
# x forwards with the lower limits, then x reversed with the upper limits
polygon(c(mean.mpg$cyl, rev(mean.mpg$cyl)), 
        c(mean.mpg$SE.low, rev(mean.mpg$SE.high)),
        density = 200, col = 'grey90')

points(avg.mpg ~ cyl, data = mean.mpg, pch = 19, col = 'firebrick')
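
The same idea can be applied to the table from the question. This is only a sketch under a few assumptions of mine: the table is read from inline text into a data frame called data, the colours are arbitrary picks, and one polygon is drawn per drug inside a loop. adjustcolor() is used to make the bands semi-transparent so overlapping intervals stay visible.

```r
# Summary table from the question, read from inline text
data <- read.table(header = TRUE, text = "
interval OR    Drug lower upper
14       0.004 a    0.002 0.205
30       0.022 a    0.001 0.101
60       0.13  a    0.061 0.23
90       0.22  a    0.14  0.34
180      0.25  a    0.17  0.35
365      0.31  a    0.23  0.41
14       0.84  b    0.59  1.19
30       0.85  b    0.66  1.084
60       0.94  b    0.75  1.17
90       0.83  b    0.68  1.01
180      1.28  b    1.09  1.51
365      1.58  b    1.38  1.82
14       1.9   c    0.9   4.27
30       2.91  c    1.47  6.29
60       2.57  c    1.52  4.55
90       2.05  c    1.31  3.27
180      2.422 c    1.596 3.769
365      2.83  c    1.93  4.26
14       0.29  d    0.04  1.18
30       0.09  d    0.01  0.29
60       0.39  d    0.17  0.82
90       0.39  d    0.2   0.7
180      0.37  d    0.22  0.59
365      0.34  d    0.21  0.53
")

drugs <- unique(data$Drug)

# Arbitrary colour per drug (an assumption, pick whatever you like)
cols <- c("firebrick", "steelblue", "darkgreen", "darkorange")
names(cols) <- drugs

# Empty plot sized to hold every confidence band
plot(OR ~ interval, data = data, type = "n",
     ylim = range(data$lower, data$upper),
     xlab = "interval", ylab = "OR")

for (d in drugs) {
  rows <- data[data$Drug == d, ]
  rows <- rows[order(rows$interval), ]  # polygon needs x in order

  # Band: x forwards along the lower limit, then backwards along the upper
  polygon(c(rows$interval, rev(rows$interval)),
          c(rows$lower, rev(rows$upper)),
          col = adjustcolor(cols[d], alpha.f = 0.2), border = NA)

  # Line and points drawn after the band so they sit on top
  lines(rows$interval, rows$OR, col = cols[d])
  points(rows$interval, rows$OR, pch = 19, col = cols[d])
}

legend("topleft", legend = drugs, col = cols, pch = 19, lty = 1, title = "Drug")
```

Drawing the band, line and points for each drug in turn means a later drug's band can partly cover an earlier drug's line; with this much transparency that is usually acceptable, but you could also draw all polygons in one loop and all lines in a second one.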

Upvotes: 3

Ruthger Righart

Reputation: 4921

You need the following lines:

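p <- ggplot(data = data, aes(x = interval, y = OR, colour = Drug)) + geom_point() + geom_line()
# refer to columns by name inside aes(); data$lower bypasses ggplot's own data handling
p <- p + geom_ribbon(aes(ymin = lower, ymax = upper), linetype = 2, alpha = 0.1)
p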
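
For a version you can paste and run as-is, here is a sketch that reads the question's table from inline text first. The variation on the lines above is my own: I map fill to Drug as well, so each band takes its drug's colour, and set colour = NA on the ribbon to drop the outline. Treat the alpha value as a starting point rather than a recommendation.

```r
library(ggplot2)

# Summary table from the question, read from inline text
data <- read.table(header = TRUE, text = "
interval OR    Drug lower upper
14       0.004 a    0.002 0.205
30       0.022 a    0.001 0.101
60       0.13  a    0.061 0.23
90       0.22  a    0.14  0.34
180      0.25  a    0.17  0.35
365      0.31  a    0.23  0.41
14       0.84  b    0.59  1.19
30       0.85  b    0.66  1.084
60       0.94  b    0.75  1.17
90       0.83  b    0.68  1.01
180      1.28  b    1.09  1.51
365      1.58  b    1.38  1.82
14       1.9   c    0.9   4.27
30       2.91  c    1.47  6.29
60       2.57  c    1.52  4.55
90       2.05  c    1.31  3.27
180      2.422 c    1.596 3.769
365      2.83  c    1.93  4.26
14       0.29  d    0.04  1.18
30       0.09  d    0.01  0.29
60       0.39  d    0.17  0.82
90       0.39  d    0.2   0.7
180      0.37  d    0.22  0.59
365      0.34  d    0.21  0.53
")

p <- ggplot(data, aes(x = interval, y = OR, colour = Drug, fill = Drug)) +
  # Band from the pre-computed limits; no model is fitted here
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2, colour = NA) +
  # Line and points are added after the ribbon so they are drawn on top of it
  geom_line() +
  geom_point()

print(p)
```

Because the ribbon uses the lower and upper columns directly, the shaded region is exactly the interval in the table. If the bands for drugs a and d are hard to tell apart near zero, adding scale_y_log10() is one option worth trying, since odds ratios are often shown on a log scale.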


Upvotes: 58
