Reputation: 105
I found a similar question here, but its answer is incomplete.
My question has two parts:
By "slim" I mean keeping only the components that predict() needs in order not to fail. I want to store the model because fitting sometimes takes a long time, for example:
lmFull <- lm(Volume~Girth+Height,data=trees)
lmSlim <- lmFull
lmSlim$fitted.values <- lmSlim$qr$qr <- lmSlim$residuals <- lmSlim$model <- lmSlim$effects <- NULL
pred1 <- predict(lmFull,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
pred2 <- predict(lmSlim,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
identical(pred1,pred2)
[1] TRUE
To store the model as text, I take the lmSlim object and deparse it:
lmTxt <- deparse(lmSlim)
lmTxt <- paste0(lmTxt,collapse="")
Storing this string in the DB is easy, but reusing it later fails:
lmRst <- eval(parse(text=lmTxt))
class(lmRst)
[1] "lm"
predict(lmRst,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
Error in eval(expr, envir, enclos) : object 'Volume' not found
Any suggestions?
Upvotes: 1
Views: 490
Reputation: 105
I've solved the issue. It is a bit of a workaround, but it works:
# fit the model and drop the components predict() does not need
lmFull <- lm(Volume~Girth+Height,data=trees)
lmSlim <- lmFull
lmSlim$fitted.values <- lmSlim$qr$qr <- lmSlim$residuals <- lmSlim$model <- lmSlim$effects <- NULL
pred1 <- predict(lmFull,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
pred2 <- predict(lmSlim,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
identical(pred1,pred2)
[1] TRUE
# deparse and collapse into a string
lmTxt <- deparse(lmSlim)
lmTxt <- paste0(lmTxt,collapse="")
# re-parsing
lmParsed <- eval(parse(text=lmTxt))
lmParsed$call <- lmFull$call
lmParsed$terms <- lmFull$terms
lmParsed
pred3 <- predict(lmParsed,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
identical(pred1,pred3)
[1] FALSE
But the difference is only floating-point rounding noise, and the parsed object is much smaller:
sum(abs(pred1 - pred3))
[1] 1.634248e-13
as.numeric(object.size(lmParsed) / object.size(lmFull))
[1] 0.3449477
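The predictions agree to within about 1e-13 and the object is roughly a third of the original size, so I can live with it.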
Upvotes: 1
Reputation: 66
Try this:
lmTxt <- dput(lmSlim)
lmRst <- eval(lmTxt)
predict(lmRst,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
Edit: as pointed out in the comments, dput does not return a string. So here's another option:
save(lmSlim, file='data.txt', ascii=TRUE)
The contents of the file are ASCII, so it should be possible to write them to a database. To reload the object later, just use the load command:
load('data.txt')
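If writing a temporary file is inconvenient, the same idea can probably be done in memory. The following is a minimal sketch, assuming the database column accepts an arbitrary-length string; it uses base R's serialize() with ascii=TRUE and rawToChar() to get a single string, and unserialize() with charToRaw() to get the object back. The names lmTxt, lmRst and newdata are only illustrative. ASCII serialization may round doubles slightly, so the comparison uses all.equal() rather than identical().

```r
# Fit and slim the model as in the question
lmFull <- lm(Volume ~ Girth + Height, data = trees)
lmSlim <- lmFull
lmSlim$fitted.values <- lmSlim$qr$qr <- lmSlim$residuals <-
  lmSlim$model <- lmSlim$effects <- NULL

# Serialize to an ASCII raw vector, then turn it into one string
lmTxt <- rawToChar(serialize(lmSlim, connection = NULL, ascii = TRUE))

# ... store lmTxt in the database, read it back later ...

# Rebuild the object from the string
lmRst <- unserialize(charToRaw(lmTxt))

newdata <- data.frame(Girth = c(1, 2, 3), Height = c(2, 3, 4))
pred_full <- predict(lmFull, newdata = newdata)
pred_rst  <- predict(lmRst, newdata = newdata)

# Compare with a numerical tolerance instead of bit-for-bit equality
all.equal(pred_full, pred_rst)
```

Unlike the deparse/parse route, this does not re-evaluate any R code when the object is restored, so the call and terms components should come back as stored.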
Upvotes: 1
Reputation: 132959
Don't store it as text. Try this:
lmFull <- lm(Volume~Girth+Height,data=trees)
lmSlim <- lmFull
lmSlim$residuals <- NULL
lmSlim$effects <- NULL
lmSlim$fitted.values <- NULL
lmSlim$model <- NULL
lmSlim$qr$qr <- NULL
predict(lmSlim)
#works
predict(lmSlim, newdata=data.frame(Girth=30, Height=20))
#works
object.size(lmFull)
#22960 bytes
object.size(lmSlim)
#7920 bytes
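To actually persist the slim object without going through text, one option is a binary round trip. This is only a sketch, assuming the database has a binary (BLOB-like) column that can hold a raw vector; serialize() and unserialize() are base R, while the names blob, lmRestored and newdata are illustrative.

```r
# Fit and slim the model as above
lmFull <- lm(Volume ~ Girth + Height, data = trees)
lmSlim <- lmFull
lmSlim$residuals <- NULL
lmSlim$effects <- NULL
lmSlim$fitted.values <- NULL
lmSlim$model <- NULL
lmSlim$qr$qr <- NULL

# Binary serialization: returns a raw vector suitable for a BLOB column
blob <- serialize(lmSlim, connection = NULL)

# ... write blob to the database, read it back later ...

# Restore the model and predict on new data
lmRestored <- unserialize(blob)
newdata <- data.frame(Girth = 30, Height = 20)
pred_full <- predict(lmFull, newdata = newdata)
pred_restored <- predict(lmRestored, newdata = newdata)
all.equal(pred_full, pred_restored)
```

For file-based storage, saveRDS() and readRDS() do the same job for a single object.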
Upvotes: 0