Reputation: 2481
The problem is illustrated using the code below. If you run it you will see that lm handles the predict gracefully while gls fails to do so. This is most likely a problem in predict.gls but I don't understand why. This is only an issue when using the factor call. Without it everything works out fine. I am fairly confident that predict.gls fails because all levels are not present in the new dataset. However, lm works it out. To me it feels like a bug but I'm not proficient enough with the gls code to determine it.
library(nlme)
# lm example
myfit<-lm(mpg~factor(cyl):disp+hp, data=mtcars)
mypred<-predict(myfit, mtcars[1:3, 1:7])
# gls example
myfit2<-gls(mpg~factor(cyl):disp+hp, data=mtcars)
mypred2<-predict(myfit2, mtcars[1:3, 1:7])
It fails with the error:
# Error in X[, names(cf), drop = FALSE] : subscript out of bounds
Any ideas?
My R.version output:
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 3
minor 0.2
year 2013
month 09
day 25
svn rev 63987
language R
version.string R version 3.0.2 (2013-09-25)
nickname Frisbee Sailing
nlme package version: "package ‘nlme’ version 3.1-113"
Upvotes: 1
Views: 2268
Reputation: 173547
Since I'm not the author of predict.gls
I can't answer precisely why it was written this way, but I'm hesitant to suggest that this is a bug in a function/package that have been around this long.
Anyway...the reason it works with lm
is that when predict.lm
calls model.frame
:
m <- model.frame(Terms, newdata, na.action = na.action,
xlev = object$xlevels)
it is using the xlev
argument and the information on the levels from the fitted model object itself. I don't believe that gls
object stores this information at all.
In predict.gls
there are two things going on. First, the initial call to model.frame
:
mfArgs <- list(formula = form, data = newdata, na.action = na.action)
mfArgs$drop.unused.levels <- TRUE
dataMod <- do.call("model.frame", mfArgs)
Note that here we're not using the xlev
argument, and in fact we're explicitly saying to drop unused levels. If you create your own version of predict.gls
and comment our the drop.unused.levels
line, it should work, as long as you don't call factor
in your formula!
Why?
Because later on we see this:
X <- model.matrix(form, dataMod)
where your formula is being re-applied. Which means that factor
is being called on columns with levels that don't exist, forcing them to be dropped.
So when I use a version of predict.gls
that comments out the drop.unused.levels
line, and I set cyl
to be a factor up front in the data frame, it generates predictions for me just fine.
But I would suggest asking the package authors as to whether this is intended behavior or not. It seems odd to me, but like I said, for a package this old and well used it seems strange that something like this would be un-intended.
Upvotes: 2