sds

Reputation: 60004

How to strip down the glm model?

The object returned by glm contains residuals, fitted values, effects, qr$qr, linear.predictors, weights, and so on, which add up to a humongous object (if the input is big enough).

How do I strip it down so that something like predict will still work?

Ideally, I want a function which would return a small function object equivalent to function(x) predict(my_model,data.frame(x=x)); something like as.stepfun for isoreg.

Upvotes: 3

Views: 511

Answers (1)

Noah

Reputation: 1404

Most of the model components are descriptive, and are not necessary for predict to work. A helper function (HT: R-Bloggers) can be used to remove the fat:

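stripGlmLR <- function(cm) {
  # Copies of the training data: response, model frame, original data
  cm$y <- NULL
  cm$model <- NULL
  cm$data <- NULL

  # Per-observation vectors (one entry per training row)
  cm$residuals <- NULL
  cm$fitted.values <- NULL
  cm$effects <- NULL
  cm$linear.predictors <- NULL
  cm$weights <- NULL
  cm$prior.weights <- NULL

  # The n-by-p QR matrix; the rest of cm$qr (e.g. the pivot) is kept
  cm$qr$qr <- NULL

  # Family functions that point prediction does not use
  cm$family$variance <- NULL
  cm$family$dev.resids <- NULL
  cm$family$aic <- NULL
  cm$family$validmu <- NULL
  cm$family$simulate <- NULL

  # Environments captured by the terms and formula
  attr(cm$terms, ".Environment") <- NULL
  attr(cm$formula, ".Environment") <- NULL

  cm
}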

Now you can apply it to your model for a reduction of more than four orders of magnitude in size (in this example):

traindata <- data.frame(x = rnorm(1e6), y = rnorm(1e6))
testdata <- data.frame(x = rnorm(10))

mod1 <- glm(y~x, data= traindata)
mod2 <- stripGlmLR(mod1)

format(object.size(mod1), units = "Kb")
# [1] "492234.5 Kb"
format(object.size(mod2), units = "Kb")
# [1] "18.5 Kb"

all(predict(object = mod1, newdata = testdata) == 
    predict(object = mod2, newdata = testdata))
# [1] TRUE

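Note that the stripped model is only meant for plain predict calls. Other glm methods, such as summary, residuals, or predict with se.fit = TRUE, rely on components removed above, so if you want the full suite of glm methods you will need to retain those components.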
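If what you are after is the small function object from the question, rather than a slimmed-down glm object, here is a minimal sketch. It assumes a model with an intercept and a single numeric predictor (as in y ~ x above); make_predictor is a hypothetical name, not something from base R or the R-Bloggers post. Unlike the default predict call, it returns predictions on the response scale (for the gaussian/identity fit above the two are the same).

```r
# Hypothetical helper: build a tiny prediction closure from a fitted glm.
# Assumes a formula of the form y ~ x (intercept plus one numeric predictor).
make_predictor <- function(model) {
  beta <- unname(coef(model))       # c(intercept, slope)
  linkinv <- model$family$linkinv   # maps the linear predictor to the response scale
  rm(model)                         # drop the fitted model so the closure does not keep it alive
  function(x) linkinv(beta[1] + beta[2] * x)
}

# Usage on a small simulated data set
set.seed(42)
d <- data.frame(x = rnorm(100), y = rnorm(100))
m <- glm(y ~ x, data = d)

f <- make_predictor(m)
xs <- c(-1, 0, 2.5)

# Should agree with predict() on the response scale
all.equal(f(xs), unname(predict(m, data.frame(x = xs), type = "response")))
```

The rm(model) line matters: a closure keeps everything in the environment it was created in, so without it the full model would still travel with f when it is saved. For models with several predictors, factors or interactions you would have to rebuild the design matrix yourself, and at that point the stripped glm object plus predict is probably the simpler route.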

Upvotes: 3
