FairyOnIce

Reputation: 2614

How to save a VERY LARGE .rda file in R package

I want to include two 460 x 5000 numeric matrices in my R package. Following the advice in "How to effectively deal with uncompressed saves during package check?", I saved the objects as:

save(mat1,file="mat1.rda",compress="xz")
save(mat2,file="mat2.rda",compress="xz")

However, the resulting files are still quite large (8.7 MB and 8.9 MB), and R CMD check --as-cran gives me this NOTE:

 * checking installed package size ... NOTE
   installed size is 20.1Mb
   sub-directories of 1Mb or more:
   data  20.0Mb

In my understanding, one cannot submit R packages to CRAN that do not pass R CMD check --as-cran cleanly (i.e., with no NOTE or WARNING). Is there a way to compress the datasets further?

Upvotes: 5

Views: 2584

Answers (2)

krlmlr

Reputation: 25444

Consider distributing the data in a separate data package, which is built, uploaded and installed only once (hopefully). Otherwise, the same data has to be transferred again every time you update your package.

(Of course, this applies only if you intend to supply updates to your package. There's no difference if your code is perfect right from the start ;-) )
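As a rough sketch of how the main package could use such a data package: the code below assumes the data package is listed under Suggests and loads a dataset from it only when it is installed. The package name "mypkgdata" and the helper load_big_matrix() are hypothetical names chosen for illustration, not part of any existing package.

```r
# Minimal sketch, assuming a separate data package exists.
# "mypkgdata" and load_big_matrix() are hypothetical names.
load_big_matrix <- function(name = "mat1", pkg = "mypkgdata") {
  # Fail with a clear message if the data package is not installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this dataset; please install it.",
         call. = FALSE)
  }
  # Load the dataset into a private environment instead of the workspace
  env <- new.env()
  utils::data(list = name, package = pkg, envir = env)
  get(name, envir = env)
}
```

With this arrangement, updates to the code package do not touch the large files at all; only the small package is rebuilt and re-uploaded.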

Upvotes: 1

Paul Hiemstra

Reputation: 60924

Is it really necessary to include those files? I see several options:

  • Include a smaller subset of the matrix, which you use in the examples.
  • Generate a matrix on-the-fly, e.g. with random numbers.
  • Put the files somewhere for download, and ensure that the examples do not execute.
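The first two options could look roughly like this. It is only a sketch: the random matrix stands in for the real data, and the subset size (50 x 200), the object name mat1_small and the helper make_example_matrix() are arbitrary choices for illustration.

```r
# Stand-in for the real 460 x 5000 matrix (assumption: the real data
# would be loaded from wherever it currently lives)
set.seed(1)
mat1 <- matrix(rnorm(460 * 5000), nrow = 460, ncol = 5000)

# Option 1: ship only a small subset that is sufficient for the examples
mat1_small <- mat1[1:50, 1:200]
path <- file.path(tempdir(), "mat1_small.rda")
save(mat1_small, file = path, compress = "xz")
file.size(path)  # size in bytes of the compressed subset

# Option 2: generate a matrix on the fly instead of shipping any data
# (hypothetical helper; note that set.seed() resets the global RNG state)
make_example_matrix <- function(nrow = 460, ncol = 5000, seed = 1) {
  set.seed(seed)
  matrix(rnorm(nrow * ncol), nrow = nrow, ncol = ncol)
}
example_mat <- make_example_matrix(10, 20)
```

For the third option, examples that need the downloaded files can be wrapped in \dontrun{} in the Rd documentation so that they are not executed during the check.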

Upvotes: 6
