Karan Saxena

Reputation: 45

R: GLM function error with simulated data

I have two vectors of simulated data as follows:

x = rnorm(1000, mean  = 0, sd = 1)

eps = rnorm(1000, mean = 0, sd = sqrt(0.25))

I am trying to use glm() together with cv.glm() from the boot library to fit a simple linear regression model and a multiple linear regression model, evaluated with either leave-one-out or k-fold cross-validation. The code I am using and the error I get are as follows:

> glm.fit=glm(y~x)
> cv.err=cv.glm(x, glm.fit)
Error in if ((K > n) || (K <= 1)) stop("'K' outside allowable range") : 
  missing value where TRUE/FALSE needed

I checked with is.na(x) and confirmed that there are no missing values. Could anyone suggest a solution or point out what I am doing wrong?

Thanks in advance.

Upvotes: 1

Views: 346

Answers (1)

StupidWolf

Reputation: 46968

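glm() can pick up x and y from the calling environment, but cv.glm() cannot. It refits the model on subsets of the rows of its first argument, so that argument has to be a data frame (or matrix) containing the variables in the formula, not the bare vector x. Maybe check this post or this book chapter.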

If I run your code I get the same error:

library(boot)
set.seed(111)
x = rnorm(1000, mean  = 0, sd = 1)
y = rnorm(1000, mean = 0, sd = sqrt(0.25))
glm.fit=glm(y~x)
cv.err=cv.glm(x, glm.fit)
Error in if ((K > n) || (K <= 1)) stop("'K' outside allowable range") : 
  missing value where TRUE/FALSE needed
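
The missing value in that condition seems to come from how cv.glm() counts observations: as far as I can tell from the boot source, it takes n from nrow(data), and K defaults to n. A minimal sketch of why a bare vector breaks this (the vector length of 10 is arbitrary and only for illustration):

```r
# A plain numeric vector has no dim attribute, so nrow() returns NULL
x <- rnorm(10)
n_vec <- nrow(x)
print(n_vec)

# Wrapping the vector in a data frame gives nrow() something to count
n_df <- nrow(data.frame(x = x))
print(n_df)
```

With n being NULL, the comparison K > n has nothing that evaluates to TRUE or FALSE, which is consistent with the error above.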

If I put them into a data.frame, it works:

da = data.frame(x=x,y=y)
glm.fit=glm(y~x, data=da)
cv.err=cv.glm(da, glm.fit,K=5)
cv.err$delta
[1] 0.2428287 0.2426424
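
Since the question also mentions leave-one-out cross-validation and multiple regression, here is a rough sketch of both, under some assumptions: the predictors x1 and x2, the coefficients used to simulate y, and the sample size of 100 are all made up for illustration and are not from the question. Leaving K at its default (the number of rows) gives leave-one-out.

```r
library(boot)
set.seed(111)

# Hypothetical simulated data: two predictors and a response (assumed, not from the question)
n <- 100
da2 <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
da2$y <- 1 + 2 * da2$x1 - da2$x2 + rnorm(n, mean = 0, sd = sqrt(0.25))

# Fit with data= so the model and cv.glm() use the same data frame
fit2 <- glm(y ~ x1 + x2, data = da2)

# Leave-one-out: K is left at its default, the number of rows
loocv <- cv.glm(da2, fit2)

# 10-fold cross-validation
kfold <- cv.glm(da2, fit2, K = 10)

# delta holds the raw and the adjusted cross-validation estimate of prediction error
print(loocv$delta)
print(kfold$delta)
```

Leave-one-out refits the model once per row, so for large data sets a smaller K is usually much faster.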

Upvotes: 0
