How to predict after lm.wfit with new data?

Question

lm.wfit (and .lm.fit) has considerable speed advantages over lm if we can do without the additional calculations of the latter.

fo <- mpg ~ hp
X <- model.matrix(fo, mtcars)

microbenchmark::microbenchmark(flm=lm <- 
                                 lm(fo, mtcars[mtcars$am == 1, ], 
                                    weights=rep(1, 13)),
                               flmw=lmw <- 
                                 lm.wfit(X[mtcars$am == 1, ], mtcars$mpg[mtcars$am == 1], 
                                         w=rep(1, 13)))
# Unit: microseconds
# expr     min       lq     mean   median       uq      max neval cld
#  flm 935.328 951.2735 994.7719 957.2695 1006.128 1333.339   100   b
# flmw  56.895  60.3400  66.2028  64.2940   66.463  186.250   100  a 
stopifnot(lm$coefficients == lmw$coefficients)

It's often gone well, but now I need to apply predict with the whole data, similar to this example.

predict(lm, mtcars)
# [...]
predict(lmw, mtcars)
# Error in UseMethod("predict") : 
#   no applicable method for 'predict' applied to an object of class "list"

Is that even possible after using this confined lm.wfit? If so, how?

MrFlick · Accepted Answer

Well, the help page for ?lm.wfit does warn against its use and that's probably for this exact reason. It doesn't store any of the information about formulas and column names that predict() normally uses to make sure your new data matches the variables types of the old data.

Instead, you could do some of the matrix multiplication yourself if you go through the same model.matrix() procedure.

model.matrix(fo, mtcars) %*% lmw$coefficients

You can see these are the same as the base predict with

all((model.matrix(fo, mtcars) %*% lmw$coefficients) == predict(lm, mtcars))
# [1] TRUE

How to predict after lm.wfit with new data?

Answers (2)

Related Questions