WayneZhou
WayneZhou

Reputation: 1

get value of column when column name is based on the value of another column

I have a data table which contains one million records and I try to create a new column based on month.idx:

dt[, new_col := get(paset0("month_",month.idx)]

and it only works for the first line.

Can anyone help me with this problem? thanks!

Data
    id month_1 month_2 month_3 month_4 month_5 month.idx
1:  x1       1       1       1       0       1         3
2:  x2       0       0       0       1       0         4
3:  x3       1       0       0       0       0         1
4:  x4       0       0       0       0       0         5
5:  x5       1       1       0       0       1         2
6:  x6       0       1       0       1       1         3
7:  x7       0       0       1       1       1         4
8:  x8       0       0       0       0       0         1
9:  x9       0       0       0       0       1         5

results:
    id month_1 month_2 month_3 month_4 month_5 month.idx new_col
1:  x1       1       1       1       0       1         3       1
2:  x2       0       0       0       1       0         4       0
3:  x3       1       0       0       0       0         1       0
4:  x4       0       0       0       0       0         5       0
5:  x5       1       1       0       0       1         2       0
6:  x6       0       1       0       1       1         3       0
7:  x7       0       0       1       1       1         4       1
8:  x8       0       0       0       0       0         1       0
9:  x9       0       0       0       0       1         5       0

expected:
    id month_1 month_2 month_3 month_4 month_5 month.idx new_col
1:  x1       1       1       1       0       1         3       1
2:  x2       0       0       0       1       0         4       1
3:  x3       1       0       0       0       0         1       1
4:  x4       0       0       0       0       0         5       0
5:  x5       1       1       0       0       1         2       1
6:  x6       0       1       0       1       1         3       0
7:  x7       0       0       1       1       1         4       0
8:  x8       0       0       0       0       0         1       0
9:  x9       0       0       0       0       1         5       1

Upvotes: 0

Views: 47

Answers (1)

chinsoon12
chinsoon12

Reputation: 25225

Here are 2 options:

1) Using get row by row taking in Frank’s comment:

DT[, new_col := get(paste0("month_", month.idx)), by= month.idx]

2) Melt and then join to do a lookup

DT[, variable := paste0("month_", month.idx)]
DT[melt(DT, id.vars="id", measure.vars=patterns("^month_")), 
    on=.(id, variable), new_col := value]

Speed is dependent on the number of rows & month columns that you have.

data:

DT <- fread("id month_1 month_2 month_3 month_4 month_5 month.idx
x1       1       1       1       0       1         3
x2       0       0       0       1       0         4
x3       1       0       0       0       0         1
x4       0       0       0       0       0         5
x5       1       1       0       0       1         2
x6       0       1       0       1       1         3
x7       0       0       1       1       1         4
x8       0       0       0       0       0         1")

Upvotes: 2

Related Questions