wenduowang

Reputation: 37

R data.table one line statement dealing with confusing variable name

Suppose dt is a data.table object with columns A, B and C.

I want to loop over the columns, select the rows that match a filter, and then apply a function to that column in those rows:

for(col in c("A", "B", "C")){
  dt[col %in% some_filter[[col]], col := some_function(col), with=FALSE]
}

Here some_filter is a list holding the valid values for each column, for example some_filter[["A"]] = c("just", "an", "example"), etc.

However, when I refer to col in those four positions, data.table seems to get the scoping confused and fails miserably.

There is a workaround using temporary variables, but how can I do this in one line?

A non-working example:

library(data.table)
library(dplyr)
dt <- data.table(A=1:10, B=11:20, C=21:30)
f <- list()
f[["A"]] <- 3:5
f[["B"]] <- 14:18
f[["C"]] <- 28:29
for(col in colnames(dt)){
  dt[col %in% f[[col]], col := col * 2, with=F] # Double up some rows
}

Upvotes: 0

Views: 106

Answers (2)

akrun

Reputation: 886928

Another option would be to use set, which assigns by reference and takes the column name as a string:

for (nm1 in names(dt)) {
  i1 <- which(dt[[nm1]] %in% f[[nm1]])
  set(dt, i = i1, j = nm1, value = dt[[nm1]][i1] * 2L)
}
dt
#      A  B  C
#  1:  1 11 21
#  2:  2 12 22
#  3:  6 13 23
#  4:  8 28 24
#  5: 10 30 25
#  6:  6 32 26
#  7:  7 34 27
#  8:  8 36 56
#  9:  9 19 58
# 10: 10 20 30
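
If the same pattern is needed with other functions, the set loop above could be wrapped in a small helper. This is only a sketch: apply_where and its argument names are hypothetical (not part of data.table), and it assumes fun returns values of the same type as the column it is applied to.

```r
library(data.table)

# Hypothetical helper (not a data.table function): for each column named in
# `filters`, apply `fun` to the rows whose value appears in filters[[column]].
apply_where <- function(dt, filters, fun) {
  for (nm in intersect(names(filters), names(dt))) {
    idx <- which(dt[[nm]] %in% filters[[nm]])
    if (length(idx) == 0L) next  # nothing matched in this column
    # set() modifies dt by reference; j is the column name as a string
    set(dt, i = idx, j = nm, value = fun(dt[[nm]][idx]))
  }
  invisible(dt)
}

dt <- data.table(A = 1:10, B = 11:20, C = 21:30)
f <- list(A = 3:5, B = 14:18, C = 28:29)

# 2L keeps the result integer, matching the column type
apply_where(dt, f, function(x) x * 2L)
dt
```

Because set updates by reference, dt itself is changed; take a copy(dt) first if the original table needs to be kept.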

Upvotes: 1

dww

Reputation: 31452

We can use get() to access a column through a character variable that holds its name. Wrapping the LHS of := in parentheses, as in (col), is also preferred over using with = FALSE.

for(col in colnames(dt)){
  dt[get(col) %in% f[[col]], (col) := get(col) * 2L] # Double up some rows
}

#      A  B  C
#  1:  1 11 21
#  2:  2 12 22
#  3:  6 13 23
#  4:  8 28 24
#  5: 10 30 25
#  6:  6 32 26
#  7:  7 34 27
#  8:  8 36 56
#  9:  9 19 58
# 10: 10 20 30
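
To see why the bare col in the question does not work, it may help to compare the two forms on a single column. A minimal sketch, reusing the example table from the question; the names bare, via_get and tmp are just for illustration.

```r
library(data.table)

dt <- data.table(A = 1:10, B = 11:20, C = 21:30)
col <- "A"

# `col` is not a column of dt, so inside `[` it resolves to the string "A"
# from the calling scope; "A" %in% 3:5 is FALSE, so no rows are selected.
bare <- dt[col %in% 3:5]
nrow(bare)

# get(col) looks up the column named by the string, so rows 3 to 5 match.
via_get <- dt[get(col) %in% 3:5]
via_get$A

# Without parentheses, `col := ...` creates a new column literally named "col".
tmp <- copy(dt)
tmp[, col := 0L]
names(tmp)

# With parentheses, (col) is evaluated first, so column A is updated instead.
dt[get(col) %in% 3:5, (col) := get(col) * 2L]
dt$A
```

So the i expression and the RHS of := need get(col), while the LHS of := needs (col).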

Upvotes: 3
