Reputation: 2187
When I read a large file (672MB on disk) into R, system memory usage jumped from 0.98 GB to 3.6 GB (I'm on a desktop with 4 GB of RAM). So storing the file in memory takes several times its size on disk, and I can't do any further computation after reading it in because I run out of memory. Is that normal?
The code I've used: a <- read.table(file.choose(), header = TRUE, colClasses = "integer", nrows = 16777777, comment.char = "", sep = "\t")
The file contains 167772XX lines.
I ran gc() before and after the read, but I'm not sure what its output means.
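For reference, a minimal sketch of what I ran ("bigfile.txt" is just a stand-in for the file I picked with file.choose()):

gc()                                    # memory in use before the read
a <- read.table("bigfile.txt",          # placeholder for file.choose()
                header = TRUE, colClasses = "integer",
                nrows = 16777777, comment.char = "", sep = "\t")
gc()                                    # memory in use after the read
print(object.size(a), units = "Mb")     # size of the resulting data frame itself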
Upvotes: 1
Views: 130
Reputation: 176648
Your text file is 672MB. Assuming all your integers are 1 digit, it's perfectly reasonable that your R object is about 2*672MB.
Each character in a text file is 1 byte, while R stores integers in 4 bytes (see ?integer). That means your file contains ~336MB of "\t" separators and ~336MB of integers stored as 1-byte characters. R reads those 1-byte characters, stores them as 4-byte integers, and 336*4 = 1344MB. The second row, second column of your gc output reads 1345.6, which equals that 1344MB plus the original 1.6MB.
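A quick sanity check of that arithmetic in R (the 336 million values are the assumption above for one-digit integers):

n_values <- 336e6               # ~half the 672MB file is digits, the rest is tabs/newlines
n_values * 4 / 1e6              # 1344 MB once stored as 4-byte integers
object.size(integer(1e6))       # ~4,000,048 bytes: ~4 bytes per integer plus a small header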
Upvotes: 6