Reputation: 37
I have a melted data table with a column containing values that refer to other column names within the same table. I want to replace each row within that same column with the row value of the referenced column.
library("data.table")
## Example input data table
DT_input <- data.table(A=c(1:10),
B=c(11:20),
C=c(21:30),
replace=c(rep("A", 5), rep("B", 3), rep("C", 2)))
## Desired output data table
DT_output <- data.table(A=c(1:10),
B=c(11:20),
C=c(21:30),
replace=c(1:5, 16:18, 29:30))
My old approach shown here is very slow because of the for loop:
## Attempted looping solution
for (kRow in seq_len(nrow(DT_input))) {
e <- parse(text = DT_input[kRow, Variable])
DT_input[kRow, Variable := eval(e)]
}
Upvotes: 1
Views: 529
Reputation: 25225
An option using data.table
:
DT_input[, rn := .I]
DT_input[, replace :=
DT_input[, DT_input[.SD, on=c("rn", .BY$replace), get(.BY$replace)], .(replace)]$V1
]
output:
A B C replace
1: 1 11 21 1
2: 2 12 22 2
3: 3 13 23 3
4: 4 14 24 4
5: 5 15 25 5
6: 6 16 26 16
7: 7 17 27 17
8: 8 18 28 18
9: 9 19 29 29
10: 10 20 30 30
It will be slower than Akrun base R method.
Upvotes: 1
Reputation: 886938
If we need a vectorized approach use the row/column
indexing from base R
i1 <- cbind(seq_len(nrow(df1)), match(df1$replace, names(df1)[-4]))
df1$replace <- df1[-4][i1]
df1$replace
#[1] 1 2 3 4 5 16 17 18 29 30
With data.table
, an option is Map
or for
loop without the eval
, but it would be still not vectorized
df1 <- as.data.frame(DT_input)
Upvotes: 1