Reputation: 185
say I have a data table like the following:
test <- data.table(a1=1:3,b1=2:4,a2=5:7,b2=7:9)
test
# a1 b1 a2 b2
# 1: 1 2 5 7
# 2: 2 3 6 8
# 3: 3 4 7 9
What I want is to replace value which are > 2 to 99, but only for columns start with 'b'. I should get
test
# a1 b1 a2 b2
# 1: 1 2 5 99
# 2: 2 99 6 99
# 3: 3 99 7 99
Any fast way of doing this?
Upvotes: 3
Views: 361
Reputation: 887118
We can use grep
to get the column names that start with 'b' ('nm1'). Specify the 'nm1' in .SDcols
, loop through the columns (.SD
), use replace
to change the values that are greater than 2 to 99, and assign (:=
) the output back to the respective columns.
nm1 <- grep('^b', names(test), value = TRUE)
test[, (nm1) := lapply(.SD, function(x)
replace(x, x>2, 99)) , .SDcols= nm1]
A faster option would be to use a for
loop with set
. It would be more efficient as the overhead of [.data.table
is avoided.
for(j in nm1){
set(test, i= which(test[[j]] >2), j=j, value=99)
}
Upvotes: 8