Indicator

Reputation: 361

In R, is there any way to share a variable between different R processes on the same machine?

My problem is that I have a large model that is slow to load into memory. To test it on many samples, I need to run a C program to generate input features for the model, then run an R script to predict. Loading the model every time takes too long.

So I am wondering:

1) Is there some method to keep the model (a variable in R) in memory?

or

2) Can I run a separate R process as a dedicated server, so that all the R prediction processes can access the variable in the server on the same machine?

The model never changes during prediction. It is a randomForest model stored in a .rdata file of ~500 MB. Loading this model is slow.

I know that I can use parallel R (snow, doPar, etc.) to perform prediction in parallel; however, this is not what I want, since it would require me to change the data flow I use.

Thanks a lot.

Upvotes: 3

Views: 646

Answers (1)

Ricardo Saporta

Reputation: 55350

If you are regenerating the model every time, you can save the model as an RData file and then share it across the different processes. While it may still take time to load from disk into memory, it saves the time of regenerating.

   save(myModel, file="path/to/file.Rda")

   # then
   load(file="path/to/file.Rda")

Edit per @VictorK's suggestion: As Victor points out, since you are saving only a single object, saveRDS may be a better choice.

  saveRDS(myModel, file="path/to/file.Rds")

  myModel <- readRDS(file="path/to/file.Rds")
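If load time from disk is the bottleneck, note that `saveRDS` compresses by default; writing the file uncompressed trades disk space for faster reads, which can matter for a ~500 MB object. A minimal sketch, using a small `lm` fit as a stand-in for the large randomForest model:

```r
# Stand-in for the large model (assumption: any serializable R object works the same way)
myModel <- lm(mpg ~ wt, data = mtcars)

path <- tempfile(fileext = ".Rds")

# compress = FALSE: larger file on disk, but typically faster save/load
saveRDS(myModel, file = path, compress = FALSE)
myModel2 <- readRDS(path)

# The restored object is equivalent to the original
identical(coef(myModel), coef(myModel2))
```

Whether this helps depends on your disk speed versus CPU speed; it is worth benchmarking both settings with `system.time(readRDS(path))` on the real model.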

Upvotes: 2
