Change values of multiple columns based on the value of one column in data.table

Question

Suppose I have a data table, dt1:

library(data.table)

dt1 <- data.table(
   names = c("A1", "XX", "A2", "XY", "A3", "XZ"),
   A1 = c( 0,    0,    0,    0,    0,    0), 
   A2 = c( 0,    0,    0,    0,    0,    0), 
   A3 = c( 0,    0,    0,    0,    0,    0)
)

I want to end up with a data table like this:

dt2 <- data.table(
   names = c("A1", "XX", "A2", "XY", "A3", "XZ"),
   A1 = c( 1,    0,    0,    0,    0,    0), 
   A2 = c( 0,    0,    1,    0,    0,    0), 
   A3 = c( 0,    0,    0,    0,    1,    0)
)

That is, if a row's value in the names column matches the name of one of the other columns, the value of that column in that row should be changed to 1.

I can achieve this via the following code:

dt1[names == "A1", "A1" := 1]
dt1[names == "A2", "A2" := 1]
dt1[names == "A3", "A3" := 1]

But I'm wondering whether there is an easier way to do this, especially when the number of columns I want to change is large.

I tried the following, but it did not work:

cln <- c("A1", "A2", "A3")
dt1[names == (cln), (cln) := 1]

Jaap · Accepted Answer

You can use data.table's efficient for(...) set(...) combination, which updates the table by reference:

for(j in names(dt1)[-1]) {
  set(dt1, dt1[, .I[names == j]], j, value = 1)
}

which gives:

> dt1
   names A1 A2 A3
1:    A1  1  0  0
2:    XX  0  0  0
3:    A2  0  1  0
4:    XY  0  0  0
5:    A3  0  0  1
6:    XZ  0  0  0

Instead of names(dt1)[-1], you can also use setdiff(names(dt1), "names"), which does not depend on names being the first column.
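
If only some columns should be updated, the same loop can run over the cln vector from the question instead of over all column names. The following is a minimal sketch under that assumption; the shortened table construction (length-one zeros recycled to six rows) and the use of which() to compute the row indices are my own choices for illustration, not part of the original answer. It also shows why comparing names to the whole vector is not a membership test: base R compares the two vectors position by position, recycling the shorter one.

```r
library(data.table)

# Same data as in the question; the length-one zeros are recycled to 6 rows
dt1 <- data.table(
  names = c("A1", "XX", "A2", "XY", "A3", "XZ"),
  A1 = 0, A2 = 0, A3 = 0
)

cln <- c("A1", "A2", "A3")  # columns to update, taken from the question

# Element-wise comparison with recycling: row k of names is compared with
# cln[((k - 1) %% 3) + 1], so this is not "is names one of cln?"
elementwise <- dt1$names == cln

# One set() call per target column; which() yields the integer row
# indices where the names column equals the column's own name
for (j in cln) {
  set(dt1, i = which(dt1$names == j), j = j, value = 1)
}

print(dt1)
```

Because set() modifies dt1 by reference, no reassignment is needed after the loop. Columns not listed in cln are left untouched.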
