codegoblin1996
codegoblin1996

Reputation: 302

Plotting prediction intervals for mixed effects model

I have implemented a mixed effects model for my experiment for how error rate affects reaction time. I now want to calculate prediction intervals and then plot them.

Here is an example of my df

    ppid error_rate      RT pNum
1   1_1_4   2.865371 0.43339    1
2  1_1_77  11.459301 0.45000    1
3  1_1_80   2.865371 0.38320    1
4  1_2_26   3.820155 0.49990    1
5  1_2_31   2.865371 0.56680    1
6  1_2_32   3.820155 0.58330    1
7  1_2_33   2.865371 0.50000    1
8  1_2_40   3.820155 0.44980    1
9  1_2_43   2.865371 0.56660    1
10 1_2_54  11.459301 0.46670    1
11 1_2_63   2.865371 0.43350    1
12 1_2_64   2.865371 0.46680    1
13 1_2_71   2.865371 0.54990    1
14 1_2_76   2.865371 0.48350    1
15 1_2_85   2.865371 0.53340    1
16 1_2_88   3.820155 0.43340    1
17 1_2_89   3.820155 0.53320    1
18  1_3_0   3.820155 0.45080    1
19  1_3_1   2.865371 0.45022    1
20 1_3_19   2.865371 0.46651    1

I then implement the mixed effects model, generate some prediction intervals for each data point and then combine my original data with the predictions:

library(lme4)
library(merTools)
library(ggplot2)

fit <- lmer(formula = RT ~ error_rate + (1 + error_rate | pNum), data = data)

pred <- cbind(data, predictInterval(fit, data))

I then plot this using ggplot and get the following plot:

ggplot(pred) + 
  geom_line(aes(x = error_rate, y = fit)) +
  geom_ribbon(aes(x = error_rate, ymin = lwr, ymax = upr), alpha = .2) +
  geom_jitter(aes(x = error_rate, y = RT), alpha = .1) +
  ylab("RT")

image of the resulting ggplot code

My plot makes sense to me: I have the black line indicating the predicted values for each error rate, and a shaded area which denotes the intervals. However I'm unsure why I'm getting the straight vertical lines in the middle of each error rate level within my data points? Also my horizontal prediction line seems wonky... does anybody know why this might be, and how to eradicate it? Many thanks!

Upvotes: 2

Views: 969

Answers (1)

Rui Barradas
Rui Barradas

Reputation: 76412

One way to have a line connecting the error_rate values without the vertical lines, is to plot mean values of the y variable fit. This is done with stat_summary as below.

ggplot(pred, aes(x = error_rate, y = fit)) + 
  stat_summary(fun.y = mean, geom = "line", show.legend = FALSE) + 
  geom_ribbon(aes(x = error_rate, ymin = lwr, ymax = upr), alpha = 0.2) +
  geom_jitter(aes(x = error_rate, y = RT), alpha = 0.1) +
  ylab("RT")

enter image description here

Note: In the question code, the ribbon is plotted with alpha = 0.2 and the points with alpha = 0.1. Would it make more sense to have the points less transparent than the underlying prediction band? And therefore to swap the alpha values?

Upvotes: 2

Related Questions