Reputation: 358
Suppose I am trying to generate prediction intervals for two sets of scores, X and Y:
set.seed(1111)
n = 1000
x1 = rnorm(n)
x2 = .5*x1 + rnorm(n, 0, sqrt(1-.25))
x_mod = lm(x2~x1)
x_se = predict(x_mod, interval="prediction", level=.68, se.fit=TRUE)$se.fit
y1 = .4*x1 + rnorm(n, 0, sqrt(1-.16))
y2 = .7*y1 + rnorm(n, 0, sqrt(1-.49))
y_mod = lm(y2~y1)
y_se = predict(y_mod, interval="prediction", level=.68, se.fit=TRUE)$se.fit
Now I want to plot the predicted values of X2 and Y2 while visually representing my uncertainty. One way to do this is to draw an ellipse, rather than a point, at each prediction. However, when I use stat_ellipse, it draws one ellipse for the entire scatterplot rather than an ellipse for each point:
d = data.frame(x1,x2,x2_pred = predict(x_mod), x_se,
y1,y2,y2_pred = predict(y_mod), y_se)
require(ggplot2)
ggplot(data=d, aes(x2_pred, y2_pred)) +
stat_ellipse(mapping=aes(x2_pred, y2_pred))
Does anyone know of a way to do a separate ellipse for each point?
Also, I'm open to other ideas for how to represent this uncertainty. (A point with a gradient of color, perhaps?)
Upvotes: 0
Views: 329
Reputation: 3542
The package ggforce provides a geom_ellipse:
library(ggforce)
ggplot(data=d, aes(x2_pred, y2_pred)) +
geom_ellipse(aes(x0 = x2_pred, y0 = y2_pred, a = x_se, b = y_se, angle = 0))
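One thing to keep in mind: se.fit is the standard error of the fitted mean, so ellipses built from it are narrower than a prediction interval. If the prediction interval from the question is what you want to show, one possibility is to use the half-width of the interval that predict() returns as the semi-axis. A minimal sketch, shown for x only; pi_half_width is a hypothetical helper name introduced here for illustration:

```r
set.seed(1111)
n  <- 1000
x1 <- rnorm(n)
x2 <- .5 * x1 + rnorm(n, 0, sqrt(1 - .25))
x_mod <- lm(x2 ~ x1)

# Hypothetical helper (not part of ggforce or the original post):
# half-width of the prediction interval for each row of `newdata`.
pi_half_width <- function(model, newdata, level = .68) {
  # With interval = "prediction", predict() returns a matrix
  # with columns fit, lwr and upr.
  pi <- predict(model, newdata = newdata,
                interval = "prediction", level = level)
  unname((pi[, "upr"] - pi[, "lwr"]) / 2)
}

x_half <- pi_half_width(x_mod, data.frame(x1 = x1))

# For comparison: the standard error of the fitted mean used above.
x_se <- predict(x_mod, se.fit = TRUE)$se.fit
summary(x_half / x_se)
```

The same helper applied to y_mod would give the vertical semi-axis, and the two results could then be passed as a and b to geom_ellipse in place of x_se and y_se.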
Another option is to draw error bars at each prediction, with or without the points themselves:
ggplot(data=d, aes(x2_pred, y2_pred)) +
# geom_point(alpha=0.2) +
geom_errorbar(aes(ymin=y2_pred-y_se, ymax=y2_pred+y_se)) +
geom_errorbarh(aes(xmin=x2_pred-x_se, xmax=x2_pred+x_se))
This approach nicely shows that the error is smallest close to the means for both x and y, and grows in the appropriate direction farther away. You could play around with themes and alpha to get something that looks nicer. The second looks a little cleaner to me, but it depends on the message you're trying to send.
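On the colour-gradient idea from the question, one rough possibility is to collapse the two standard errors into a single number per point and map that to colour. The sketch below assumes a root-sum-of-squares combination (total_se is a name made up here, and other summaries would be just as defensible), and it simulates y1 with a standard deviation of sqrt(1-.16):

```r
library(ggplot2)

set.seed(1111)
n  <- 200
x1 <- rnorm(n)
x2 <- .5 * x1 + rnorm(n, 0, sqrt(1 - .25))
y1 <- .4 * x1 + rnorm(n, 0, sqrt(1 - .16))
y2 <- .7 * y1 + rnorm(n, 0, sqrt(1 - .49))
x_mod <- lm(x2 ~ x1)
y_mod <- lm(y2 ~ y1)

d <- data.frame(
  x2_pred = predict(x_mod),
  y2_pred = predict(y_mod),
  x_se    = predict(x_mod, se.fit = TRUE)$se.fit,
  y_se    = predict(y_mod, se.fit = TRUE)$se.fit
)

# Assumption: summarise the uncertainty of each point as the
# root sum of squares of the two standard errors.
d$total_se <- sqrt(d$x_se^2 + d$y_se^2)

# Darker points are more certain, lighter points less so.
p <- ggplot(d, aes(x2_pred, y2_pred, colour = total_se)) +
  geom_point() +
  scale_colour_gradient(low = "black", high = "grey80")
p
```

This loses the direction of the uncertainty that the ellipses and error bars show, so it is probably best suited to plots with many overlapping points.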
Upvotes: 2