Reputation: 4517
I have fitted a simple natural spline (df = 3) model and I'm trying to predict for some out-of-sample observations. Using the function predict(), I'm able to get fitted values for in-sample observations, but I've not been able to get the predicted value for new observations.
Here is my code:
library(splines)
set.seed(12345)
x <- seq(0, 2, by = 0.01)
y <- rnorm(length(x)) + 2*sin(2*pi*(x-1/4))
# My n.s fit:
fit.temp <- lm(y ~ ns(x, knots = seq(0.01, 2, by = 0.1)))
# Getting fitted values:
fit.temp.values <- predict(fit.temp,interval="prediction", level = 1 - 0.05)
# Plotting the data, the fit, and the 95% CI:
plot(x, y, ylim = c(-6, +6))
lines(x, fit.temp.values[,1], col = "darkred")
lines(x, fit.temp.values[,2], col = "darkblue", lty = 2)
lines(x, fit.temp.values[,3], col = "darkblue", lty = 2)
# Consider the points for which we want to get the predicted values:
x.new <- c(0.275, 0.375, 0.475, 0.575, 1.345)
How can I get the predicted values for x.new?
Thanks very much for your help,
p.s. I searched all related questions on SO and I didn't find the answer.
Upvotes: 3
Views: 10667
Reputation: 57696
Create a data frame with a column called x, and pass it as the newdata argument to predict:
predict(fit.temp, newdata=data.frame(x=x.new))
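If you also want the 95% prediction interval at the new points, as in the in-sample plot from the question, the same call should accept the interval and level arguments. This is a minimal sketch that reuses the question's data and model; new.df and pred.new are names introduced here for illustration only.

```r
library(splines)

# Recreate the data and fit from the question
set.seed(12345)
x <- seq(0, 2, by = 0.01)
y <- rnorm(length(x)) + 2 * sin(2 * pi * (x - 1/4))
fit.temp <- lm(y ~ ns(x, knots = seq(0.01, 2, by = 0.1)))

# New observations: the column name must match the variable in the formula (x)
x.new <- c(0.275, 0.375, 0.475, 0.575, 1.345)
new.df <- data.frame(x = x.new)  # new.df is an illustrative name

# Point predictions plus 95% prediction interval for the new x values
pred.new <- predict(fit.temp, newdata = new.df,
                    interval = "prediction", level = 1 - 0.05)
pred.new  # matrix with columns fit, lwr, upr; one row per new x
```

The result has one row per element of x.new, so pred.new[, "fit"] can be passed straight to points() to overlay the predictions on the existing plot.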
Upvotes: 8
Reputation: 263481
You are sending individual vectors to lm. If you want to see what is going wrong here, then type:
fit.temp$terms
... and notice that the name of the x-predictor is:
attr(,"term.labels")
[1] "ns(x, knots = seq(0.01, 2, by = 0.1))"
You would need to give predict a list with that as a name for x. Much easier would be to use lm and predict.lm with a data frame argument so that the predictions can be done with internal re-evaluation of the new values.
df <- data.frame(x,y)
# My n.s fit:
fit.temp <- lm(y ~ ns(x, knots = seq(0.01, 2, by = 0.1)) , data=df)
predict(fit.temp, newdata=list(x =c(0.275, 0.375, 0.475, 0.575, 1.345) ) )
# 1 2 3 4 5
#0.9264572 1.6549046 2.0743470 1.9507962 0.8220687
points(x.new, predict(fit.temp,
newdata=list(x =c(0.275, 0.375, 0.475, 0.575, 1.345) )),
col="red", cex=2)
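A common way for this to go wrong is a newdata argument whose column name does not match the variable in the formula. The sketch below contrasts the two cases using the same data and model; the misnamed column z, and the names good and bad, are hypothetical and introduced only for illustration.

```r
library(splines)

# Same data and model as above
set.seed(12345)
x <- seq(0, 2, by = 0.01)
y <- rnorm(length(x)) + 2 * sin(2 * pi * (x - 1/4))
df <- data.frame(x, y)
fit.temp <- lm(y ~ ns(x, knots = seq(0.01, 2, by = 0.1)), data = df)

x.new <- c(0.275, 0.375, 0.475, 0.575, 1.345)

# Column named x: predict() evaluates the spline basis at the new values
good <- predict(fit.temp, newdata = data.frame(x = x.new))
length(good)

# Hypothetical mistake: column named z. predict() cannot find x in newdata,
# falls back to the x in the workspace and warns about the row mismatch
# (the warning is suppressed here to keep the output short).
bad <- suppressWarnings(predict(fit.temp, newdata = data.frame(z = x.new)))
length(bad)
```

If the lengths differ, predict() did not use the new values, so comparing length(good) with length(x.new) is a cheap sanity check. Had there been no x in the workspace, the second call would be expected to fail with an "object not found" error instead.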
Upvotes: 1