Reputation: 45
I'm getting NA as my result. What am I doing wrong?
data(Boston, package='MASS')
x <- Boston$dis
y <- Boston$nox
n <- length(x)
cvs <- rep(0, n)
for(i in 1:n){
  xi <- x[-i]
  yi <- y[-i]
  d <- loess(yi~xi, span=0.2, degree=2)
  cvs[i] <- (y[i] - predict(d, newdata=data.frame(xi=x[i])))^2
}
mean(cvs)
Upvotes: 2
Views: 1514
Reputation: 173
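You need to pass control = loess.control(surface = "direct") to loess() in order to extrapolate. With the default surface = "interpolate", predict() returns NA for any point outside the range of the data the model was fitted on, which is what happens when the left-out observation is the smallest or largest x.
You may also want to find a faster way to do this, since the loop refits the model once per observation.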
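A minimal sketch of the question's loop with that setting applied, assuming everything else stays as posted. The names train and fit are mine, not from the question, and I pass the training data through data= only to keep the formula variables explicit:

```r
data(Boston, package = "MASS")
x <- Boston$dis
y <- Boston$nox
n <- length(x)
cvs <- numeric(n)

for (i in seq_len(n)) {
  # Leave observation i out of the training data
  train <- data.frame(xi = x[-i], yi = y[-i])
  # surface = "direct" evaluates the local fit exactly at the requested
  # point, so predict() can extrapolate beyond the training range
  fit <- loess(yi ~ xi, data = train, span = 0.2, degree = 2,
               control = loess.control(surface = "direct"))
  cvs[i] <- (y[i] - predict(fit, newdata = data.frame(xi = x[i])))^2
}

anyNA(cvs)  # check that no fold produced NA
mean(cvs)   # leave-one-out CV estimate of the squared prediction error
```

Note that the extrapolated predictions at the two extremes come from a local quadratic evaluated outside the data, so those two squared errors may be less reliable than the rest.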
Upvotes: 0
Reputation: 263352
mean(cvs,na.rm=TRUE)
#[1] 0.003753745
plot(y~x)
lines( d$x[order(d$x)], d$fitted[order(d$x)])
which(is.na(d$fitted[order(d$x)]) )
#integer(0)
You can see that the NA's come at the extremes of the x-range:
#add as debugging code as the last statement inside the for loop (the final } closes the loop)
if(is.na(cvs[i]) ) {print(i);print(x[i])}}
#[1] 354
#[1] 12.1265
#[1] 373
#[1] 1.1296
range(x)
#[1] 1.1296 12.1265
But I still don't understand why predicting at those same two x values from the final fit d does not give NA:
(y[ which(is.na(cvs)) ]-predict(d, x[ which(is.na(cvs)) ] ))^2
#[1] 3.218139e-06 3.742504e-04
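A likely explanation, sketched below: after the loop finishes, d is the fit from the last iteration (i = n), which only leaves out the last observation, so both the smallest and the largest x are still in its training data and nothing has to be extrapolated. Inside the loop, the fit for i = 354 was made without the largest x, and predict.loess with the default surface = "interpolate" returns NA outside the range of the fitted data. The names i_max, train, fit_default and p_default are mine:

```r
data(Boston, package = "MASS")
x <- Boston$dis
y <- Boston$nox

# Index of the largest x (observation 354 in the output above)
i_max <- which.max(x)

# Refit without that observation, as the loop does for i = i_max
train <- data.frame(xi = x[-i_max], yi = y[-i_max])
fit_default <- loess(yi ~ xi, data = train, span = 0.2, degree = 2)

# The left-out x now lies beyond the range of the training data
outside <- x[i_max] > max(train$xi)

# Default surface = "interpolate" does not extrapolate
p_default <- predict(fit_default, newdata = data.frame(xi = x[i_max]))

outside
is.na(p_default)
```

The same reasoning applies to observation 373, which holds the smallest x.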
Upvotes: 2
Reputation: 25454
Aggregate functions such as mean return NA by default when the input contains NA values, and your result vector contains some:
which(is.na(cvs))
## [1] 354 373
Not sure where they come from, but you could make the mean function ignore NA values by passing na.rm=TRUE.
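A small illustration on a toy vector (v is a made-up example, not the question's data):

```r
v <- c(1, NA, 3)

m_default <- mean(v)                # NA: one missing value poisons the result
m_narm    <- mean(v, na.rm = TRUE)  # 2: the NA is dropped before averaging

m_default
m_narm
```

Keep in mind that with na.rm=TRUE the two NA folds are simply left out of the average, so the result is computed over n - 2 squared errors rather than n.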
Upvotes: 1