How to subset certain column and change value of a data table

Question

say I have a data table like the following:

test <- data.table(a1=1:3,b1=2:4,a2=5:7,b2=7:9)
test
#    a1 b1 a2 b2
# 1:  1  2  5  7
# 2:  2  3  6  8
# 3:  3  4  7  9

What I want is to replace value which are > 2 to 99, but only for columns start with 'b'. I should get

test
#    a1  b1 a2  b2
# 1:  1   2  5  99
# 2:  2  99  6  99
# 3:  3  99  7  99

Any fast way of doing this?

akrun · Accepted Answer

We can use grep to get the column names that start with 'b' ('nm1'). Specify the 'nm1' in .SDcols, loop through the columns (.SD), use replace to change the values that are greater than 2 to 99, and assign (:=) the output back to the respective columns.

nm1 <- grep('^b', names(test), value = TRUE)
test[, (nm1) := lapply(.SD, function(x) 
          replace(x, x>2, 99)) , .SDcols= nm1]

A faster option would be to use a for loop with set. It would be more efficient as the overhead of [.data.table is avoided.

for(j in nm1){
  set(test, i= which(test[[j]] >2), j=j, value=99)
}

How to subset certain column and change value of a data table

Answers (1)

Related Questions