Han Xu
Han Xu

Reputation: 185

How to subset certain column and change value of a data table

say I have a data table like the following:

test <- data.table(a1=1:3,b1=2:4,a2=5:7,b2=7:9)
test
#    a1 b1 a2 b2
# 1:  1  2  5  7
# 2:  2  3  6  8
# 3:  3  4  7  9

What I want is to replace value which are > 2 to 99, but only for columns start with 'b'. I should get

test
#    a1  b1 a2  b2
# 1:  1   2  5  99
# 2:  2  99  6  99
# 3:  3  99  7  99

Any fast way of doing this?

Upvotes: 3

Views: 361

Answers (1)

akrun
akrun

Reputation: 887118

We can use grep to get the column names that start with 'b' ('nm1'). Specify the 'nm1' in .SDcols, loop through the columns (.SD), use replace to change the values that are greater than 2 to 99, and assign (:=) the output back to the respective columns.

nm1 <- grep('^b', names(test), value = TRUE)
test[, (nm1) := lapply(.SD, function(x) 
          replace(x, x>2, 99)) , .SDcols= nm1]

A faster option would be to use a for loop with set. It would be more efficient as the overhead of [.data.table is avoided.

for(j in nm1){
  set(test, i= which(test[[j]] >2), j=j, value=99)
}

Upvotes: 8

Related Questions