Aviad Klein

Reputation: 105

How to store an R model as text?

I found a similar question here, but its answer is incomplete.

My question has two parts:

  1. I want to store a "slim" version of an R lm() object as text in a DBMS.
  2. I want to be able to produce predictions out of the text object I saved.

By "slim" I mean keeping just enough data that the predict() function won't fail. I want to store the model because fitting it sometimes takes a long time. For example:

lmFull <- lm(Volume~Girth+Height,data=trees)
lmSlim <- lmFull
lmSlim$fitted.values <- lmSlim$qr$qr <- lmSlim$residuals <- lmSlim$model <- lmSlim$effects <- NULL
pred1 <- predict(lmFull,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
pred2 <- predict(lmSlim,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
identical(pred1,pred2)
[1] TRUE

To store it as text, I take the lmSlim object and deparse it:

lmTxt <- deparse(lmSlim)
lmTxt <- paste0(lmTxt,collapse="")

Storing this in the DB is easy, but when I try to reuse it:

lmRst <- eval(parse(text=lmTxt))
class(lmRst)
[1] "lm"
predict(lmRst,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
Error in eval(expr, envir, enclos) : object 'Volume' not found

Any suggestions?

Upvotes: 1

Views: 490

Answers (3)

Aviad Klein

Reputation: 105

I've solved the issue. It is a bit of a workaround, but it works:

# learning and reducing the size of output
lmFull <- lm(Volume~Girth+Height,data=trees)
lmSlim <- lmFull
lmSlim$fitted.values <- lmSlim$qr$qr <- lmSlim$residuals <- lmSlim$model <- lmSlim$effects <- NULL
pred1 <- predict(lmFull,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
pred2 <- predict(lmSlim,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
identical(pred1,pred2)
[1] TRUE

# deparse and collapse into a string
lmTxt <- deparse(lmSlim)
lmTxt <- paste0(lmTxt,collapse="")

# re-parsing
lmParsed <- eval(parse(text=lmTxt))
lmParsed$call <- lmFull$call
lmParsed$terms <- lmFull$terms
lmParsed
pred3 <- predict(lmParsed,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
identical(pred1,pred3)
[1] FALSE

But the difference is negligible, and the re-parsed object is about a third of the size of the original:

sum(abs(pred1 - pred3))
[1] 1.634248e-13
as.numeric(object.size(lmParsed) / object.size(lmFull))
[1] 0.3449477

So I can live with it.
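
For a plain linear model, an alternative to deparsing the whole object is to store only the formula and the coefficients as text and rebuild the predictions with model.matrix(). The following is a minimal sketch under that assumption, not the approach used above. The names formTxt, coefTxt and predictFromText are made up for illustration, and the "digits17" deparse option is used here to reduce rounding loss in the stored coefficients.

```r
# Fit the model once (the expensive step in a real application).
lmFull <- lm(Volume ~ Girth + Height, data = trees)

# Two short strings are enough to predict from a plain lm():
# the formula and the coefficient vector.
formTxt <- paste(deparse(formula(lmFull)), collapse = "")
coefTxt <- paste(deparse(unname(coef(lmFull)), control = "digits17"),
                 collapse = "")

# Hypothetical helper: rebuild predictions from the two stored strings.
predictFromText <- function(formTxt, coefTxt, newdata) {
  beta <- eval(parse(text = coefTxt))
  # Drop the response so newdata only needs the predictor columns.
  rhs <- delete.response(terms(as.formula(formTxt)))
  X <- model.matrix(rhs, newdata)
  drop(X %*% beta)
}

newdata <- data.frame(Girth = c(1, 2, 3), Height = c(2, 3, 4))
pred1 <- predict(lmFull, newdata = newdata)
pred4 <- predictFromText(formTxt, coefTxt, newdata)
all.equal(unname(pred1), unname(pred4))
```

This sketch ignores factor levels, custom contrasts and rank-deficient fits (NA coefficients), so it only suits simple models like the one in the question.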

Upvotes: 1

jpetterson

Reputation: 66

Try this:

lmTxt <- dput(lmSlim)
lmRst <- eval(lmTxt)
predict(lmRst,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))

Edit: as pointed out in the comments, dput does not return a string. So here's another option:

save(lmSlim, file='data.txt', ascii=TRUE)

The contents of the file are ASCII, so it should be possible to write them to a database. To reload the object later, use the load command:

load('data.txt')
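
If the intermediate file is unwanted, the same ASCII representation can be produced in memory. The sketch below assumes base R's serialize() with ascii = TRUE and converts the resulting raw vector to a single string with rawToChar(). How that string is written to the database is left out, since it depends on the driver.

```r
lmFull <- lm(Volume ~ Girth + Height, data = trees)
lmSlim <- lmFull
lmSlim$fitted.values <- lmSlim$qr$qr <- lmSlim$residuals <- lmSlim$model <- lmSlim$effects <- NULL

# With connection = NULL, serialize() returns a raw vector instead of writing
# to a file; with ascii = TRUE its bytes are plain ASCII characters.
lmTxt <- rawToChar(serialize(lmSlim, connection = NULL, ascii = TRUE))

# ... store lmTxt in a text column and read it back later ...

# Restore the object from the string.
lmRst <- unserialize(charToRaw(lmTxt))

newdata <- data.frame(Girth = c(1, 2, 3), Height = c(2, 3, 4))
all.equal(predict(lmSlim, newdata = newdata), predict(lmRst, newdata = newdata))
```

The ASCII format may round doubles slightly, so the comparison uses all.equal() rather than identical().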

Upvotes: 1

Roland

Reputation: 132959

Don't store it as text. Try this:

lmFull <- lm(Volume~Girth+Height,data=trees)
lmSlim <- lmFull
lmSlim$residuals <- NULL
lmSlim$effects <- NULL
lmSlim$fitted.values <- NULL
lmSlim$model <- NULL
lmSlim$qr$qr <- NULL
predict(lmSlim)
#works
predict(lmSlim, newdata=data.frame(Girth=30, Height=20))
#works

object.size(lmFull)
#22960 bytes
object.size(lmSlim)
#7920 bytes
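
One way to keep the slim object without going through text is binary serialization. This is a sketch that assumes the database has a binary (BLOB) column and that the driver accepts a raw vector; neither is stated in the question. The name lmBlob is made up for illustration.

```r
lmFull <- lm(Volume ~ Girth + Height, data = trees)
lmSlim <- lmFull
lmSlim$residuals <- NULL
lmSlim$effects <- NULL
lmSlim$fitted.values <- NULL
lmSlim$model <- NULL
lmSlim$qr$qr <- NULL

# serialize() with connection = NULL returns the object as a raw vector.
lmBlob <- serialize(lmSlim, connection = NULL)

# ... store lmBlob in a binary column and read it back later ...

# unserialize() rebuilds the lm object from the raw vector.
lmRst <- unserialize(lmBlob)

newdata <- data.frame(Girth = 30, Height = 20)
all.equal(predict(lmSlim, newdata = newdata), predict(lmRst, newdata = newdata))
```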

Upvotes: 0
