mjaniec

Reputation: 1114

In-memory data processing in R: save -> readBin -> ?

How can I access R data that was originally saved with save and later read back with readBin?

Let me try to explain:

I saved some data (mostly matrices and lists) to a file using save.

Later I transformed (encrypted) this file and wrote the result out using writeBin.

Since the file is transformed, I cannot read the data directly with load. I need to read it with readBin and apply the inverse transformation in memory.

The problem is that after reading with readBin and transforming, the data are in memory, but I cannot access them as R objects (such as matrices or lists), because they are not recognized as such: there is just a single binary object (a raw vector).

The easiest way would be to use this binary object as a connection for load.

Unfortunately, load does not seem to accept in-memory binary connections.

I guess .Internal(loadFromConn2(...)) may be the key to this, but I do not know the details of its internal workings.

Is there any way to make R turn the binary data held in memory back into the original R objects (matrices, lists, etc.)?

The encryption code I am using is available at: http://pastebin.com/eVfVQYwn

Thanks in advance.

Upvotes: 1

Views: 817

Answers (1)

r2evans

Reputation: 161085

(If you aren't interested in learning how to research this type of problem in the future, skip to "Results", far below.)

Long Story ...

Knowing a few things about how R objects are stored by save tells you how to retrieve them with load. From help(save):

 save(..., list = character(),
      file = stop("'file' must be specified"),
      ascii = FALSE, version = NULL, envir = parent.frame(),
      compress = !ascii, compression_level,
      eval.promises = TRUE, precheck = TRUE)

Since ascii defaults to FALSE, the default for compress (!ascii) is TRUE, so:

compress: logical or character string specifying whether saving to a named file is to use compression. 'TRUE' corresponds to 'gzip' compression, ...

The key here is that it defaults to 'gzip' compression. From here, let's look at help(load):
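To see the compression default on disk, a minimal sketch (the variable and temporary file names are made up for illustration) can compare the first bytes of a default save with those of an uncompressed one:

```r
x <- 1:3
f_gz    <- tempfile(fileext = ".RData")  # hypothetical file names
f_plain <- tempfile(fileext = ".RData")

save(x, file = f_gz)                      # default: compress = TRUE (gzip)
save(x, file = f_plain, compress = FALSE) # no compression

# A gzip stream starts with the two magic bytes 1f 8b.
gz_head <- readBin(f_gz, what = raw(), n = 2L)
gz_head

# An uncompressed binary save starts with the characters "RDX".
plain_head <- rawToChar(readBin(f_plain, what = raw(), n = 3L))
plain_head
```

The leading 1f 8b is the same signature that shows up when the whole file is read into a raw vector.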

'load' ... can read a compressed file (see 'save') directly from a file or from a suitable connection (including a call to 'url').

(Emphasis added by me.) This implies both that load will take a connection that is not an actual file, and that it takes care of decompression itself. My usual go-to function for faking file connections is textConnection, but that does not work with binary data, and its help page does not point to a binary equivalent. Continuing from help(load):

A not-open connection will be opened in mode '"rb"' and closed after use. Any connection other than a 'gzfile' or 'gzcon' connection will be wrapped in 'gzcon' to allow compressed saves to be handled ...

Diving a little tangentially (remember the previous mention of gzip compression?), help(gzcon):

Compressed output will contain embedded NUL bytes, and so 'con' is not permitted to be a 'textConnection' opened with 'open = "w"'. Use a writable 'rawConnection' to compress data into a variable.

Aha! Now we see that there is a function rawConnection which one would (correctly) infer is the binary-mode equivalent of textConnection.
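As a rough illustration of what load does according to the help text quoted above, the sketch below wraps a rawConnection in gzcon by hand and reads the first few decompressed bytes. The variable names are assumptions for the example, and the exact header depends on the save format version in use.

```r
x <- 1:3
fn <- tempfile(fileext = ".RData")       # hypothetical temporary file
save(x, file = fn)                       # gzip-compressed by default

# Pull the whole file into memory as a raw vector.
bytes <- readBin(fn, what = raw(), n = file.info(fn)$size)

# rawConnection makes the raw vector readable like a binary file;
# gzcon adds transparent gzip decompression on top of it.
con <- gzcon(rawConnection(bytes, open = "rb"))
magic <- readChar(con, nchars = 5L, useBytes = TRUE)
close(con)

# The decompressed stream should begin with "RDX", a format version
# digit, and a newline.
magic
```

This is only for inspection. In normal use load does the gzcon wrapping itself, so passing it the rawConnection is enough.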

Results (aka "long story short, too late")

Your pastebin code is interesting but unfortunately moot. Reproducible examples make things easier for people considering answering your question.

Your problem statement, restated:

set.seed(1234)
fn <- 'test-mjaniec.Rdata'
(myvar1 <- rnorm(5))
##  [1] -1.2070657  0.2774292  1.0844412 -2.3456977  0.4291247
(myvar2 <- sample(letters, 5))
##  [1] "s" "n" "g" "v" "x"
save(myvar1, myvar2, file=fn)
rm(myvar1, myvar2) ## ls() shows they are no longer available

x.raw <- readBin(fn, what=raw(), n=file.info(fn)$size)
head(x.raw)
## [1] 1f 8b 08 00 00 00
## how to access the data stored in `x.raw`?

The answer:

load(rawConnection(x.raw, open='rb'))

(Confirmation:)

myvar1
##  [1] -1.2070657  0.2774292  1.0844412 -2.3456977  0.4291247
myvar2
##  [1] "s" "n" "g" "v" "x"

(It works with your encryption code, too, by the way.)
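For the full round trip from the question (save, transform, writeBin, readBin, inverse transform, load), here is a minimal sketch. The helpers load_from_raw and xor_bytes are hypothetical names introduced for this example, and the XOR step is only a toy stand-in for the pastebin encryption, not a secure cipher. The helper also closes the connection explicitly, since help(load) notes that an already-open connection is not closed for you.

```r
# Hypothetical helper: load save()-format bytes from a raw vector into
# a separate environment, closing the connection afterwards.
load_from_raw <- function(bytes, envir = new.env()) {
  con <- rawConnection(bytes, open = "rb")
  on.exit(close(con))
  load(con, envir = envir)
  envir
}

# Toy reversible transform (XOR with a repeating key); applying it
# twice with the same key restores the input. NOT real encryption.
xor_bytes <- function(bytes, key) xor(bytes, rep_len(key, length(bytes)))

myvar1 <- c(1.5, 2.5)
myvar2 <- list(a = "x", b = 1:3)
plain_file <- tempfile(fileext = ".RData")
enc_file   <- tempfile(fileext = ".bin")
save(myvar1, myvar2, file = plain_file)

# "Encrypt" the saved file and write the result with writeBin.
key   <- as.raw(c(0x5a, 0xc3, 0x17))
plain <- readBin(plain_file, what = raw(), n = file.info(plain_file)$size)
writeBin(xor_bytes(plain, key), enc_file)

# Later: read the transformed file, undo the transform in memory, load.
enc      <- readBin(enc_file, what = raw(), n = file.info(enc_file)$size)
restored <- load_from_raw(xor_bytes(enc, key))
ls(restored)
restored$myvar2
```

Loading into a dedicated environment rather than the workspace avoids silently overwriting existing objects with the same names.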

Upvotes: 5
