Reputation: 37
Suppose dt is a data.table object with columns A, B and C.
I want to loop over the columns to filter out some rows, and then apply a function to that column:
for(col in c("A", "B", "C")){
dt[col %in% some_filter[[col]], col := some_function(col), with=FALSE]
}
Where some_filter is a list containing the valid values for each column, for example some_filter[["A"]] = c("just", "an", "example"), etc.
However, when col is referenced in those 4 positions, data.table seems to mess up the namespace and fail miserably.
There is a workaround via temporary variables, but how can I do this in one line?
A non-working example:
library(data.table)
library(dplyr)
dt <- data.table(A=1:10, B=11:20, C=21:30)
f <- list()
f[["A"]] <- 3:5
f[["B"]] <- 14:18
f[["C"]] <- 28:29
for(col in colnames(dt)){
dt[col %in% f[[col]], col := col * 2, with=F] # Double up some rows
}
Upvotes: 0
Views: 106
Reputation: 886928
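Another option would be to use set, which assigns by reference:
for(nm1 in names(dt)) {
  # rows whose value in this column appears in the filter
  i1 <- which(dt[[nm1]] %in% f[[nm1]])
  # double only those rows, in place
  set(dt, i = i1, j = nm1, value = dt[[nm1]][i1]*2L)
}
dt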
# A B C
# 1: 1 11 21
# 2: 2 12 22
# 3: 6 13 23
# 4: 8 28 24
# 5: 10 30 25
# 6: 6 32 26
# 7: 7 34 27
# 8: 8 36 56
# 9: 9 19 58
#10: 10 20 30
Upvotes: 1
Reputation: 31452
We can use get to access columns from a character variable containing their names. Wrapping the LHS of := in () is also preferred to using with = F:
for(col in colnames(dt)){
dt[get(col) %in% f[[col]], (col) := get(col) * 2L] # Double up some rows
}
# A B C
# 1: 1 11 21
# 2: 2 12 22
# 3: 6 13 23
# 4: 8 28 24
# 5: 10 30 25
# 6: 6 32 26
# 7: 7 34 27
# 8: 8 36 56
# 9: 9 19 58
# 10: 10 20 30
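To see why the bare col in the question does not filter anything, it may help to look at what each expression evaluates to. The snippet below is only a small sketch that reuses the dt and f from the question; the names bare and by_name are made up for illustration.

```r
library(data.table)

dt <- data.table(A = 1:10, B = 11:20, C = 21:30)
f  <- list(A = 3:5, B = 14:18, C = 28:29)

col <- "A"

# dt has no column called `col`, so a bare `col` resolves to the character
# string "A" from the calling scope. The test is therefore a single value.
bare <- col %in% f[[col]]

# get(col) looks up the column named by the string, so the test yields
# one TRUE/FALSE per row of dt.
by_name <- dt[, get(col) %in% f[[col]]]

length(bare)     # one value, not one per row
length(by_name)  # one value per row
which(by_name)   # the rows that pass the filter for column A
```

The same reasoning applies to the left-hand side of :=, which is why (col) is needed there: the parentheses tell data.table to use the value of col as the column name.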
Upvotes: 3