Samuel Perche

Reputation: 39

adding different columns to ggplot with for loop

I'm trying to plot some test errors for different models using ggplot, but it seems the for loop just replaces the last term every time:

library(data.table)  # needed for data.table() below
library(ggplot2)
test <- data.table('obs'= rep(0,10), 'exp'=rnorm(10) ,'pred1'=rnorm(10), 'pred2'=rnorm(10), 'pred3'=rnorm(10), date=1:10)
error_plot <- ggplot() + geom_point(aes(x=date, y = exp - obs), data = test, colour = "black") 

pred_names <- paste0('pred', 1:3)
colours_plot <- c('green', 'blue', 'yellow', 'purple')

for (i in 1:length(pred_names)){
  error_plot <- error_plot + geom_point(aes(x=date, y = get(pred_names[i]) - obs), data = test, colour = colours_plot[i])
  print(error_plot)
}

If I run it without the loop, everything is fine:

error_plot <- error_plot + geom_point(aes(x=date, y = get(pred_names[1]) - obs), data = test, colour = colours_plot[1]) +
  geom_point(aes(x=date, y = get(pred_names[2]) - obs), data = test, colour = colours_plot[2]) +
  geom_point(aes(x=date, y = get(pred_names[3]) - obs), data = test, colour = colours_plot[3]) 

Upvotes: 1

Views: 68

Answers (1)

r2evans

Reputation: 160417

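ggplot2 evaluates aesthetics lazily: aes() stores the expression get(pred_names[i]) - obs unevaluated, and i is only looked up when the plot is rendered. By then the loop has moved i on, so every layer added so far resolves to the same column (pred3 once the loop has finished). Fortunately, ggplot2 can also add a list of geoms, so we can use lapply and family to build the layers inside a function, where each call keeps its own value of i. Depending on your preference of iteration styles, choose one of: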

error_plot +
  lapply(seq_along(pred_names), function(i) {
    geom_point(aes(x = date, y = get(pred_names[i]) - obs),
               data = test, colour = colours_plot[i])
  })
## or ##
error_plot +
  Map(function(pn, cn) {
    geom_point(aes(x = date, y = get(pn) - obs), data = test, colour = cn)
  }, pred_names, colours_plot[1:3])

(Note that your pred_names is shorter than colours_plot, ergo the need for [1:3] in the Map-version.)
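The lazy-lookup effect described above can be reproduced without ggplot2 at all. The following is only a minimal base-R sketch of the same mechanism, not what ggplot2 does internally; the names from_loop and from_lapply are made up for the illustration.

```r
# Closures created in a for loop all share the single loop variable `i`,
# so they see whatever value `i` holds when they are finally called.
fs <- list()
for (i in 1:3) {
  fs[[i]] <- function() i
}
from_loop <- sapply(fs, function(f) f())

# With lapply, each call of the outer function has its own `i`,
# so every closure keeps the value it was created with.
gs <- lapply(1:3, function(i) function() i)
from_lapply <- sapply(gs, function(g) g())

print(from_loop)    # 3 3 3
print(from_lapply)  # 1 2 3
```

This is why wrapping the geom_point call in a function (via lapply or Map) fixes the plot: each layer is built in an environment where i (or pn) no longer changes.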

But perhaps a more ggplot2-canonical method would be to use long data for your points, which allows fewer calls, optionally a legend (which I've disabled here), and the other things that variables mapped to aesthetics can accomplish:

testlong <- melt(test, id.vars = c("date", "obs", "exp"), variable.name = "pred")
testlong
#      date   obs         exp   pred       value
#     <int> <num>       <num> <fctr>       <num>
#  1:     1     0  0.43281803  pred1  0.27655075
#  2:     2     0 -0.81139318  pred1  0.67928882
#  3:     3     0  1.44410126  pred1  0.08983289
#  4:     4     0 -0.43144620  pred1 -2.99309008
# ---
# 27:     7     0 -0.78383894  pred3  0.25792144
# 28:     8     0  1.57572752  pred3  0.08844023
# 29:     9     0  0.64289931  pred3 -0.12089654
# 30:    10     0  0.08976065  pred3 -1.19432890
#      date   obs         exp   pred       value

ggplot(test) +
  geom_point(aes(x = date, y = exp - obs), colour = "black") + 
  geom_point(aes(x = date, y = value - obs, colour = pred), data = testlong) +
  scale_colour_manual(guide = "none", values = setNames(colours_plot[1:3], pred_names))  # guide = FALSE is deprecated in recent ggplot2

I think in this case we should not use testlong for the original geom_point, as that would triple-plot each of those points. One could always mitigate that with unique if you wanted to go all-in with a single frame.
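For completeness, here is one way the single-frame variant with unique could look. It is a sketch under assumptions: the seed and the name exp_points are mine, and the data is regenerated so the block runs on its own.

```r
library(data.table)
library(ggplot2)

set.seed(42)  # assumed seed, only so the example is reproducible
test <- data.table(obs = rep(0, 10), exp = rnorm(10),
                   pred1 = rnorm(10), pred2 = rnorm(10), pred3 = rnorm(10),
                   date = 1:10)
pred_names <- paste0("pred", 1:3)
colours_plot <- c("green", "blue", "yellow", "purple")

testlong <- melt(test, id.vars = c("date", "obs", "exp"), variable.name = "pred")

# testlong repeats each (date, obs, exp) row once per prediction column;
# unique() collapses them back to one row per date for the black points.
exp_points <- unique(testlong[, .(date, obs, exp)])

p <- ggplot(testlong, aes(x = date)) +
  geom_point(aes(y = exp - obs), data = exp_points, colour = "black") +
  geom_point(aes(y = value - obs, colour = pred)) +
  scale_colour_manual(guide = "none",
                      values = setNames(colours_plot[1:3], pred_names))
print(p)
```

Everything is derived from testlong here, at the cost of an extra de-duplication step; whether that is clearer than keeping test around for the first layer is a matter of taste.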

Upvotes: 2
