Reputation: 339
I have a data.table and want to take a linear combination of the columns. How should I do it?
The setup
require(data.table)
set.seed(1)
DT <- data.table(A = rnorm(10),
                 B = rnorm(10),
                 C = rnorm(10),
                 D = rnorm(10),
                 coefA = rnorm(10),
                 coefB = rnorm(10),
                 coefC = rnorm(10),
                 coefD = rnorm(10))
I can do the following:
DT[, sum := A*coefA + B * coefB + C * coefC + D * coefD]
Is there a better way to solve this?
Upvotes: 1
Views: 1274
Reputation: 3240
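Assuming you need a more general method because you won't always have exactly four value/coefficient pairs, the following still works after adding E, F, G and coefE, coefF, coefG, and so on, as long as the value columns and the coefficient columns appear in the same order. Note that valucols picks up every column whose name does not contain "coef", so build it before adding any other column (such as sum) to DT.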
coefcols <- names(DT)[grepl("coef", names(DT))]
valucols <- names(DT)[!grepl("coef", names(DT))]
DT[, sum := apply(DT[, ..valucols] * DT[, ..coefcols], 1, sum)]
Edit: After reading @lmo's comment, I realized that the last line can be simplified using rowSums:
DT[, sum := rowSums(DT[, ..valucols] * DT[, ..coefcols])]
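If relying on column order feels fragile, one possible variation is to derive each value column's name from its coefficient column, so the pairs are matched by name. This is only a sketch: it assumes every coefficient column is named "coef" followed by the value column's name, and the output column name lincomb is a hypothetical choice made here to avoid shadowing base::sum.

```r
library(data.table)

set.seed(1)
DT <- data.table(A = rnorm(10), B = rnorm(10), C = rnorm(10), D = rnorm(10),
                 coefA = rnorm(10), coefB = rnorm(10),
                 coefC = rnorm(10), coefD = rnorm(10))

# Assumption: coefficient columns are named "coef" + <value column name>
coefcols <- grep("^coef", names(DT), value = TRUE)
valucols <- sub("^coef", "", coefcols)   # partner names derived from the coef names

# Fail early if any coefficient column has no matching value column
stopifnot(all(valucols %in% names(DT)))

# Element-wise product of the paired columns, then the sum across each row
vals  <- as.matrix(DT[, ..valucols])
coefs <- as.matrix(DT[, ..coefcols])
DT[, lincomb := rowSums(vals * coefs)]
```

Because valucols is built from coefcols, unrelated columns in DT are ignored and the two sets always line up pair by pair.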
Upvotes: 0
Reputation: 887691
One option is
DT[, sum := Reduce(`+`, DT[, 1:4] * DT[, 5:8])]
Or using .SD
DT[, sum := Reduce(`+`, .SD[, 1:4] * .SD[, 5:8])]
Or we can do
nm1 <- names(DT)[1:4]
nm2 <- paste0("coef", nm1)
DT[, sum := Reduce(`+`, Map(`*`, mget(nm1), mget(nm2)))]
Upvotes: 3