Reputation: 61
I would like to use ggplot2
to illustrate a fit using a restricted cubic spline using geom_smooth()
but it seems to be working incorrectly. Here is a short example:
# rms package Contains Restricted Cubic Splines (RCS)
library(rms)
library(ggplot2)
# Load Data
data(cars)
# Model Fit with RCS
fit <- lm(speed ~ rcs(dist, 5), data=cars)
# Obtain Diagnostic Data
plot.dat <- cbind(cars, fitted=fitted(fit))
# Compare Smooth to Actual
ggplot(data=plot.dat) +
geom_point(aes(x=dist, y=speed)) +
geom_smooth(aes(x=dist, y=speed), method="lm",
formula=y ~ rcs(x, 5), se=FALSE, colour="blue") +
geom_line(aes(y=fitted, x=dist), size=1.25, colour="red")
This results in the following image:
Comparison of Splines
I am not sure why geom_smooth()
is not giving the correct results. Clearly there is a work-around (as illustrated above), but is there a way to make geom_smooth()
produce the correct results?
Upvotes: 3
Views: 2100
Reputation: 61
It has been a long time, but I finally recognized the problem, and I thought I would post it here for those interested. Internally, geom_smooth()
will create a sequence of the predictor at which to plot the predicted response. As this sequence is spaced out across the range of the x-axis, the knot points selected by rcs()
inside of geom_smooth()
will differ from the knot points selected by rcs()
on the original data. To address this, you need to pass in the correct knot points.
# rms package Contains Restricted Cubic Splines (RCS)
library(rms)
library(ggplot2)
# Load Data
data(cars)
# Model Fit with RCS
fit <- lm(speed ~ rcs(dist, 5), data=cars)
# Obtain Diagnostic Data
plot.dat <- cbind(cars, fitted=fitted(fit))
# Compare Smooth to Actual
ggplot(data=plot.dat) +
geom_point(aes(x=dist, y=speed)) +
geom_smooth(aes(x=dist, y=speed), method="lm",
formula=y ~ rcs(x, quantile(plot.dat$dist, probs = c(0.05, 0.275, 0.5, 0.725, 0.95))), se=FALSE, colour="blue") +
geom_line(aes(y=fitted, x=dist), colour="red")
Upvotes: 3
Reputation: 263331
I don't know how to integrate this with geom_smooth
but I can do it with ggplot.Predict
from the rms package:
ddist <- datadist(cars)
options(datadist='ddist')
fit <- ols(speed~ rcs(dist,5),data=cars,
x=TRUE, y=TRUE)
ggplot(Predict(fit))+geom_point(data=cars, aes(x=dist, y=speed))
Upvotes: 3