R Yoda

Reputation: 8770

How to protect/encrypt R objects in RData files due to EU-GDPR

I want to protect the content of my RData files with a strong encryption algorithm since they may contain sensitive personal data which must not be disclosed due to (legal) EU-GDPR requirements.

How can I do this from within R?

I want to avoid a second manual step to encrypt the RData files after creating them to minimize the risk of forgetting it or overlooking any RData files.

I am working with Windows in this scenario...

Upvotes: 5

Views: 2677

Answers (3)

trope

Reputation: 61

Based on hrbrmstr's answer, I wrote two simple functions: saveRDSEnc and readRDSEnc.

For large objects it is much better to save the object first (with compression), read the saved file back as a raw vector, encrypt that, and then save the encrypted content without compression. Encrypted data is effectively incompressible, so compression has to happen before encryption. The code below relies on this.

library(openssl)

###
#' Serialization Interface for Single Objects with encryption
#' 
#' @details Function to write a single R object to a file with encryption using
#'  symmetric AES encryption
#'
#' @param object R object to serialize
#' @param file Path of the file to write
#' @param ... further arguments passed to saveRDS function (e.g. compress)
#' @param password Encryption password
#'
#' @return NULL (invisibly)
#' @export
#' 
#' @examples
#' x <- "Hello world!"
#' saveRDSEnc(x, file='test.rds', compress='xz', password='1234')
###
saveRDSEnc <- function(object, file, ..., password) {
  stopifnot("Missing password!" = !missing(password))
  
  # Derive a 256-bit AES key from the password
  key <- openssl::sha256(charToRaw(as.character(password)))
  # Write the (optionally compressed) RDS file first ...
  saveRDS(object, file = file, ...)
  
  # ... then read it back as raw bytes, encrypt them, and overwrite the file
  x <- readBin(con = file, what = raw(), n = file.size(file))
  x <- openssl::aes_cbc_encrypt(data = x, key = key)
  saveRDS(object = x, file = file, compress = FALSE)
  
  invisible(NULL)
}

###
#' Serialization Interface for Single Objects with encryption
#'
#' @details Function to read a single R object from an encrypted file using
#'  symmetric AES decryption.
#'
#' @param file Path of the encrypted file to read
#' @param ... further arguments passed to readRDS function
#' @param password Decryption password 
#'
#' @return Restored object
#' @export
#' 
#' @examples
#' x <- readRDSEnc('test.rds', password='1234')
#' print(x) # Hello world!
###
readRDSEnc <- function(file, ..., password) {
  stopifnot("Missing password!" = !missing(password))
  
  key <- openssl::sha256(charToRaw(as.character(password)))
  tmpf <- tempfile()
  
  tryCatch({
    # Read the encrypted raw vector and decrypt it
    x <- readRDS(file)
    x <- openssl::aes_cbc_decrypt(data = x, key = key)
    # The decrypted bytes are a complete RDS file: write them to a
    # temporary file and read the original object from there
    writeBin(object = x, con = tmpf)
    x <- readRDS(tmpf, ...)
  }, finally = unlink(tmpf))

  x
}

Upvotes: 1

Revanth Nemani

Reputation: 171

I know this is very late, but check out the endecrypt package.

Installation:

devtools::install_github("RevanthNemani/endecrypt")

Use the following functions for column encryption:

airquality <- EncryptDf(x = airquality, pub.key = pubkey, encryption.type = "aes256")

For column decryption:

airquality <- DecryptDf(x = airquality, prv.key = prvkey, encryption.type = "aes256")

Check out the GitHub page.

Just remember to generate your keys and save them before first use. Load the keys when required and supply the key object to the functions.

For example:

SaveGenKey(bits = 2048,
              private.key.path = "Encription/private.pem",
              public.key.path = "Encription/public.pem")

# Load keys already stored using this function 
prvkey <- LoadKey(key.path = "Encription/private.pem", Private = TRUE)

It is very easy to use, and your data frames can be stored in a database or an RData file.
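As a point of comparison, the same generate, save and load key workflow can be sketched with the openssl package alone. This is not the endecrypt API and it encrypts the whole data frame rather than individual columns; the file names and the temporary directory are assumptions made for illustration.

```r
library(openssl)

# Generate an RSA key pair and store it as PEM files
# (the directory and file names are hypothetical).
key_dir <- tempfile("keys")
dir.create(key_dir)
key <- rsa_keygen(bits = 2048)
write_pem(key, file.path(key_dir, "private.pem"))
write_pem(key$pubkey, file.path(key_dir, "public.pem"))

# Encrypt: only the public key is needed. encrypt_envelope() encrypts the
# data with a random AES session key and protects that key with RSA.
pubkey <- read_pubkey(file.path(key_dir, "public.pem"))
envelope <- encrypt_envelope(serialize(airquality, NULL), pubkey)
saveRDS(envelope, file.path(key_dir, "airquality-enc.rds"))

# Decrypt: requires the private key.
prvkey <- read_key(file.path(key_dir, "private.pem"))
envelope <- readRDS(file.path(key_dir, "airquality-enc.rds"))
restored <- unserialize(decrypt_envelope(
  envelope$data, envelope$iv, envelope$session, key = prvkey
))
```

In this sketch the private key is written without a password; `write_pem()` accepts a `password` argument if the key file itself should be protected.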

Upvotes: 1

hrbrmstr

Reputation: 78832

library(openssl)

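# Serialize the object to a raw vector in memory
x <- serialize(list(1,2,3), NULL)

# Derive a 256-bit AES key from a passphrase
passphrase <- charToRaw("This is super secret")
key <- sha256(passphrase)

# Encrypt; the random IV is stored as an attribute of the result
encrypted_x <- aes_cbc_encrypt(x, key = key)

saveRDS(encrypted_x, "secret-x.rds")

# Later: read the encrypted raw vector back, decrypt, and unserialize
encrypted_y <- readRDS("secret-x.rds")

y <- unserialize(aes_cbc_decrypt(encrypted_y, key = key))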

You need to deal with secrets management (i.e. the key) but this general idiom should work (with a tad more bulletproofing).
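One possible shape for that bulletproofing, as a minimal sketch: wrap the idiom in two helpers and take the passphrase from an environment variable instead of the script. The helper names and the variable name `RDATA_PASSPHRASE` are hypothetical, and an environment variable is only one of several ways to handle the key.

```r
library(openssl)

# Hypothetical helper: derive a 32-byte AES key from a passphrase kept in an
# environment variable (the variable name is an assumption for illustration).
key_from_env <- function(var = "RDATA_PASSPHRASE") {
  passphrase <- Sys.getenv(var, unset = "")
  if (!nzchar(passphrase)) stop("Environment variable ", var, " is not set")
  sha256(charToRaw(passphrase))
}

# Serialize in memory, encrypt, and write the encrypted raw vector to disk.
save_encrypted <- function(object, file, key = key_from_env()) {
  payload <- serialize(object, connection = NULL)
  saveRDS(aes_cbc_encrypt(payload, key = key), file)
  invisible(file)
}

# Read the encrypted raw vector, check that it looks as expected, decrypt.
read_encrypted <- function(file, key = key_from_env()) {
  encrypted <- readRDS(file)
  stopifnot(is.raw(encrypted), !is.null(attr(encrypted, "iv")))
  unserialize(aes_cbc_decrypt(encrypted, key = key))
}

# Usage: set the passphrase outside the script in practice (e.g. in .Renviron).
Sys.setenv(RDATA_PASSPHRASE = "This is super secret")
path <- tempfile(fileext = ".rds")
save_encrypted(list(1, 2, 3), path)
roundtrip <- read_encrypted(path)
```

Because the plaintext is only ever serialized in memory here, no unencrypted copy of the object is written to disk.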

Upvotes: 13
