Jonathan Gellar

Reputation: 325

Create a ggplot2 geom for a line and confidence interval

I would like to design a geom to plot a line with a confidence interval around it. The data frame that this will be based on contains the following:

  1. The x values
  2. The y values of the main line at each x value
  3. The standard error of y=f(x) for each x

For example,

xvals <- seq(0,2*pi,by=0.01)
df <- data.frame(x=xvals, y=sin(xvals), se=.25)
head(df)
     x           y   se
1 0.00 0.000000000 0.25
2 0.01 0.009999833 0.25
3 0.02 0.019998667 0.25
4 0.03 0.029995500 0.25
5 0.04 0.039989334 0.25
6 0.05 0.049979169 0.25

I followed the guidelines laid out here to write the following geom function:

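library(ggplot2)

# Returns a list of layers: a ribbon for the 95% interval (y +/- 1.96*se),
# the main line, and a theme.
geom_myci <- function(yvar, sevar) {
  list(geom_ribbon(mapping=aes_q(ymin=substitute(yvar-1.96*sevar),
                                 ymax=substitute(yvar+1.96*sevar)),
                   colour="lightgrey", fill="lightgrey"),
       geom_line(mapping=aes_q(y=substitute(yvar)), lwd=2),
       theme_bw())
}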

This can then be used as follows:

ggplot(df, aes(x,y)) + geom_myci(y,se)

This works well. The only thing I don't like is that the user has to enter the y variable twice. Is there any way, within the geom function, to find out which variable is already mapped to "y"?

Upvotes: 2

Views: 2572

Answers (1)

Rentrop

Reputation: 21507

You can do this with aes_string, using substitute to capture the name of the se variable, as follows. Have a look at http://adv-r.had.co.nz/Functions.html to see how substitute and lazy evaluation work.

geom_myci <- function(se_var = se) {
  se_var <- as.character(substitute(se_var))
  list(geom_ribbon(aes_string(ymin = sprintf("y - 1.96 * %s", se_var),
                              ymax = sprintf("y + 1.96 * %s", se_var)),
                   colour="lightgrey", fill="lightgrey"),
       geom_line(lwd=2),
       theme_bw())
}

as.character(substitute(se_var)) turns se_var into a string, in this example "se". You can then use it to build the aes_string arguments via sprintf("y - 1.96 * %s", se_var), which results in "y - 1.96 * se", the string we need.

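To see the string-building step in isolation, here is a minimal base R sketch with no ggplot2 involved. The helper name ci_strings and the column name stderr_col are made up for illustration. The last lines evaluate the resulting string against a small data frame, which is roughly what happens when the aesthetic is computed.

```r
# Hypothetical helper: builds the ymin/ymax expression strings for an SE column.
ci_strings <- function(se_var = se) {
  # substitute() captures the unevaluated argument (or its default);
  # as.character() turns a bare name such as `se` into the string "se".
  se_name <- as.character(substitute(se_var))
  c(ymin = sprintf("y - 1.96 * %s", se_name),
    ymax = sprintf("y + 1.96 * %s", se_name))
}

default_strs <- ci_strings()            # ymin is "y - 1.96 * se"
custom_strs  <- ci_strings(stderr_col)  # ymin is "y - 1.96 * stderr_col"

# The string is just R code; evaluated against the data it gives the lower bound.
toy <- data.frame(x = c(0, 1), y = c(0, 1), se = 0.25)
lower <- eval(parse(text = default_strs[["ymin"]]), toy)  # -0.49  0.51
```

Note that this only behaves as intended for a bare column name. For an expression such as d$se, as.character(substitute(...)) returns several pieces rather than one string.
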
The plot is then produced (as asked for) with:

ggplot(df, aes(x,y)) + geom_myci()
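
One caveat: the string "y - 1.96 * se" refers to a data column literally named y, so the function above relies on the data frame using that column name. If the goal is to pick up whatever variable is mapped to the y aesthetic, one possible approach (not part of the answer above, only a sketch) is a small custom stat that computes ymin and ymax from the mapped aesthetics. The names StatMyci and geom_myci2, the se aesthetic, and the toy data frame df2 are all assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical stat: derives ymin/ymax from the *mapped* y and se aesthetics.
StatMyci <- ggproto("StatMyci", Stat,
  required_aes = c("x", "y", "se"),
  compute_group = function(data, scales) {
    data$ymin <- data$y - 1.96 * data$se
    data$ymax <- data$y + 1.96 * data$se
    data
  }
)

# Hypothetical wrapper: ribbon from StatMyci, plus the line and theme from the question.
geom_myci2 <- function(mapping = NULL, data = NULL, ...) {
  list(
    layer(stat = StatMyci, geom = GeomRibbon, mapping = mapping, data = data,
          position = "identity", inherit.aes = TRUE,
          params = list(colour = "lightgrey", fill = "lightgrey", ...)),
    geom_line(lwd = 2),
    theme_bw()
  )
}

# Column names deliberately differ from x / y / se.
df2 <- data.frame(t = c(0, 1, 2), value = c(1, 2, 3), err = 0.5)
p <- ggplot(df2, aes(t, value)) + geom_myci2(aes(se = err))

# Inspect the data computed for the ribbon layer (layer 1).
ribbon <- layer_data(p, 1)
```

Here the y variable is entered only once, in the ggplot() call, and the standard error column is supplied as an aesthetic, so neither has to follow a fixed naming scheme.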


Upvotes: 1
