Reputation: 73342
lm.wfit
(and .lm.fit
) has considerable speed advantages over lm
if we can do without the additional calculations of the latter.
fo <- mpg ~ hp
X <- model.matrix(fo, mtcars)
microbenchmark::microbenchmark(flm=lm <-
lm(fo, mtcars[mtcars$am == 1, ],
weights=rep(1, 13)),
flmw=lmw <-
lm.wfit(X[mtcars$am == 1, ], mtcars$mpg[mtcars$am == 1],
w=rep(1, 13)))
# Unit: microseconds
# expr min lq mean median uq max neval cld
# flm 935.328 951.2735 994.7719 957.2695 1006.128 1333.339 100 b
# flmw 56.895 60.3400 66.2028 64.2940 66.463 186.250 100 a
stopifnot(lm$coefficients == lmw$coefficients)
It's often gone well, but now I need to apply predict
with the whole data, similar to this example.
predict(lm, mtcars)
# [...]
predict(lmw, mtcars)
# Error in UseMethod("predict") :
# no applicable method for 'predict' applied to an object of class "list"
Is that even possible after using this confined lm.wfit
? If so, how?
Upvotes: 2
Views: 117
Reputation: 5273
lm
is slower than lm.fit
and lm.wfit
because it uses those functions internally. You've pulled some other internal work out by creating the model matrix outside the benchmark. That's fine if you expect to reuse the model matrix, but otherwise it's a misleading benchmark.
As for predict
, it's a generic method. Because there's no predict.list
method, it fails. If you want, you can write a method for a custom class, then assign that class to the returned value of lm.wfit
.
E.g.:
my_lm_wfit <- function(...) {
fit <- lm.wfit(...)
class(fit) <- "my_lm_wfit"
fit
}
predict.my_lm_wfit <- function(...) {
# Do something...
}
Upvotes: 2
Reputation: 206411
Well, the help page for ?lm.wfit
does warn against its use and that's probably for this exact reason. It doesn't store any of the information about formulas and column names that predict()
normally uses to make sure your new data matches the variables types of the old data.
Instead, you could do some of the matrix multiplication yourself if you go through the same model.matrix()
procedure.
model.matrix(fo, mtcars) %*% lmw$coefficients
You can see these are the same as the base predict with
all((model.matrix(fo, mtcars) %*% lmw$coefficients) == predict(lm, mtcars))
# [1] TRUE
Upvotes: 3