Krishnan

Reputation: 1319

R data.table Size and Memory Limits

I have a 15.4 GB R data.table object with 29 million records and 135 variables. My system and R info are as follows:

Windows 7 x64 on an x86_64 machine with 16 GB RAM; "R version 3.1.1 (2014-07-10)" on "x86_64-w64-mingw32".

I get the following memory allocation error:

[screenshot of the memory allocation error]

I set my memory limits as follows:

# memory.limit(size = 7000000)   # 7,000,000 MB ~ 7 TB (commented out)
# Change memory.limit to 40 GB when using the ff library
memory.limit(size = 40000)       # size is given in MB, so 40000 MB = 40 GB

My questions are the following:

  1. Should I change the memory limit to 7 TB?
  2. Should I break the file into chunks and process them one at a time (rough sketch below)?
  3. Any other suggestions?
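
For option 2, I have something like the following in mind (the file name, chunk size, and placeholder processing step are just for illustration):

library(data.table)

chunk_size <- 1e6                       # rows per chunk; tune to available RAM
n_rows     <- 29e6                      # total records in the file

for (start in seq(0, n_rows - 1, by = chunk_size)) {
    chunk <- fread("data.csv",          # placeholder file name
                   skip   = start + 1,  # skip header plus already-read rows
                   nrows  = chunk_size,
                   header = FALSE)
    # ... process the chunk here and keep only the (small) result ...
    rm(chunk); gc()                     # free the chunk before the next iteration
}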

Upvotes: 7

Views: 7424

Answers (2)

R Yoda

Reputation: 8780

Try profiling your code to identify which statements waste RAM:

# install.packages("pryr")
library(pryr) # for memory debugging

memory.size(max = TRUE) # print max memory used so far (works only with MS Windows!)
mem_used()
gc(verbose=TRUE) # show internal memory stuff (see help for more)

# start profiling your code
Rprof( pfile <- "rprof.log", memory.profiling=TRUE) # uncomment to profile the memory consumption

# !!! Your code goes here

# Print memory statistics within your code whereever you think it is sensible
memory.size(max = TRUE)
mem_used()
gc(verbose=TRUE)

# stop profiling your code
Rprof(NULL)
summaryRprof(pfile,memory="both") # show the memory consumption profile

Then evaluate the memory consumption profile...

Since your code stops with an "out of memory" error, you should reduce the input data to an amount that makes your code workable, and use this reduced input for memory profiling...
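
For example, a minimal sketch of reducing the input with data.table::fread (the file name and column names are placeholders):

library(data.table)

# Read only the first 1 million rows (and optionally a subset of columns)
# so the code fits in RAM while you profile it.
dt_sample <- fread("data.csv",                  # placeholder file name
                   nrows  = 1e6,                # subset of the 29 million records
                   select = c("var1", "var2"))  # placeholder column names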

Upvotes: 8

Stereo

Reputation: 1193

You could try the ff package; it works well with on-disk data.
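
For example, a minimal sketch of loading a large CSV into an on-disk ffdf instead of an in-memory data.frame (the file name and chunk size are assumptions):

# install.packages("ff")
library(ff)

# read.csv.ffdf reads the file chunk by chunk and stores the result on disk,
# so only the chunk currently being read has to fit into RAM.
big <- read.csv.ffdf(file      = "data.csv",  # placeholder file name
                     header    = TRUE,
                     next.rows = 500000)      # rows read per chunk; tune to your RAM

dim(big)  # all 29 million rows are available, but live on disk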

Upvotes: 0
