Reputation: 85
I am trying to run the code below to mask the data in two columns, but it fails with the error shown after it:
setwd("/cloud/project/CX")
Credit_tbl <- read.csv(file = 'Sample_data.csv', sep = ",", stringsAsFactors = FALSE)
anonymize <- function(x, algo = "crc32") {
  unq_hashes <- vapply(unique(x), function(object) digest(object, algo = algo), FUN.VALUE = "", USE.NAMES = TRUE)
  unname(unq_hashes[x])
}
cols_to_mask <- c("Email","Phone")
Credit_tbl[,cols_to_mask := lapply(.SD, anonymize),.SDcols=cols_to_mask,with=FALSE]
Error:
Error in `[.data.frame`(Credit_tbl, , `:=`(cols_to_mask, lapply(.SD, anonymize)), : unused arguments (.SDcols = cols_to_mask, with = FALSE)
Upvotes: 2
Views: 2219
Reputation: 389235
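You have a data.frame, but you are using data.table syntax. read.csv returns a plain data.frame, whose [ method does not understand :=, .SD or .SDcols, hence the "unused arguments" error. Convert the data.frame to a data.table with setDT() and then apply the function. Note that the left-hand side of := is wrapped in parentheses, (cols_to_mask), so that the column names stored in the variable are used; with = FALSE is not needed.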
library(data.table)
library(digest)
cols_to_mask <- c("Email","Phone")
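# Hash each value; blanks and NAs are returned as empty strings
anonymize <- function(x, algo = "crc32") {
  vapply(x, function(y) {
    if (is.na(y) || y == "") "" else digest(y, algo = algo)
  }, FUN.VALUE = character(1), USE.NAMES = FALSE)
}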
setDT(Credit_tbl)
Credit_tbl[, (cols_to_mask) := lapply(.SD, anonymize), .SDcols = cols_to_mask]
Without converting to data.table, you can apply the function to the same columns using lapply:
Credit_tbl[cols_to_mask] <- lapply(Credit_tbl[cols_to_mask], anonymize)
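To try the base R variant without the original CSV, here is a minimal sketch. The sample rows are made up for illustration (they are not from Sample_data.csv), and the copy named masked is only there so the original values stay available for comparison. It assumes the digest package is installed.

```r
library(digest)

# Hypothetical sample data standing in for Sample_data.csv
Credit_tbl <- data.frame(
  Name  = c("A", "B", "C", "D"),
  Email = c("a@example.com", "b@example.com", "a@example.com", NA),
  Phone = c("555-0100", "", "555-0100", "555-0199"),
  stringsAsFactors = FALSE
)

# Hash each value; blanks and NAs are returned as empty strings
anonymize <- function(x, algo = "crc32") {
  vapply(x, function(y) {
    if (is.na(y) || y == "") "" else digest(y, algo = algo)
  }, FUN.VALUE = character(1), USE.NAMES = FALSE)
}

cols_to_mask <- c("Email", "Phone")

# Work on a copy so the unmasked values remain for comparison
masked <- Credit_tbl
masked[cols_to_mask] <- lapply(masked[cols_to_mask], anonymize)

# Equal inputs get equal hashes, so masked columns can still be grouped or joined on
identical(masked$Email[1], masked$Email[3])

# Columns not listed in cols_to_mask are left untouched
identical(masked$Name, Credit_tbl$Name)

print(masked)
```

Keep in mind that crc32 is a checksum, not a cryptographic hash, so this hides values from casual viewing rather than protecting them against a determined lookup.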
Upvotes: 1