HOSS_JFL

Reputation: 839

R: reducing digits/precision for saving RAM?

I am running out of RAM in R with a data.table that contains ~100M rows and 40 columns full of doubles. My naive thought was that I could reduce the object size of the data.table by reducing the precision, since there is no need for 15 digits after the decimal point. I played around with rounding, but as we know

round(1.68789451154844878,3)

gives

 1.6879999999999999

and does not help, because the rounded result is still stored as a double. Therefore, I converted the values to integers. However, as the small example below shows for a numeric vector, this gives only a 50% reduction, from 8000040 bytes to 4000040 bytes, and the size does not shrink any further when the precision is reduced more.

Is there a better way to do that?

set.seed(1)
options(digits=22)

a1 = rnorm(10^6)
a2 = as.integer(1000000*(a1)) 
a3 = as.integer(100000*(a1)) 
a4 = as.integer(10000*(a1)) 
a5 = as.integer(1000*(a1)) 

head(a1)
head(a2)
head(a3)
head(a4)
head(a5)

give

[1] -0.62645381074233242  0.18364332422208224 -0.83562861241004716  1.59528080213779155  0.32950777181536051 -0.82046838411801526
[1] -626453  183643 -835628 1595280  329507 -820468
[1] -62645  18364 -83562 159528  32950 -82046
[1] -6264  1836 -8356 15952  3295 -8204
[1] -626  183 -835 1595  329 -820

and

object.size(a1)
object.size(a2)
object.size(a3)
object.size(a4)
object.size(a5)

give

8000040 bytes
4000040 bytes
4000040 bytes
4000040 bytes
4000040 bytes

Upvotes: 6

Views: 530

Answers (1)

Avraham

Reputation: 1719

Not as such, no. In R, an integer takes 4 bytes and a double takes 8, regardless of how many significant digits the value has. If you are allocating space for 1M integers, you are perforce going to need 4M bytes of RAM for the vector of results.
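You can verify the fixed per-element sizes directly; a minimal sketch (the byte counts match the output in the question: 10^6 elements times 8 or 4 bytes, plus a small fixed vector header):

    # Storage per element is fixed by the atomic type, not by the
    # number of significant digits a value happens to carry.
    x_dbl <- rnorm(10^6)              # double: 8 bytes per element
    x_int <- as.integer(1000 * x_dbl) # integer: 4 bytes per element

    object.size(x_dbl)  # 8000040 bytes = 10^6 * 8 + header
    object.size(x_int)  # 4000040 bytes = 10^6 * 4 + header

So scaling by 1000 versus 1000000 before the `as.integer()` conversion changes the stored values, but not the storage: every integer vector of the same length occupies the same amount of RAM.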

Upvotes: 1
