morningfin

Reputation: 339

How to compute the linear combination of different columns within R data.table

I have a data.table and want to take a linear combination of the columns. How should I do it?

The setup

library(data.table)
set.seed(1)

DT <- data.table(A = rnorm(10),
                 B = rnorm(10),
                 C = rnorm(10),
                 D = rnorm(10),
                 coefA = rnorm(10),
                 coefB = rnorm(10),
                 coefC = rnorm(10),
                 coefD = rnorm(10))

I can do the following:

DT[, sum := A*coefA + B * coefB + C * coefC + D * coefD]

Is there a better way to solve this?

Upvotes: 1

Views: 1274

Answers (3)

Eric Watt

Reputation: 3240

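Assuming you need a more general method because you may not always have exactly four value columns and four coefficient columns, the following works as long as the value columns and the coefficient columns appear in the same relative order, for example after adding E, F, G and coefE, coefF, coefG...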

coefcols <- names(DT)[grepl("coef", names(DT))]
valucols <- names(DT)[!grepl("coef", names(DT))]
DT[, sum := apply(DT[, ..valucols] * DT[, ..coefcols], 1, sum)]

Edit: After reading @lmo's comment, I realized that the last line can be simplified using rowSums:

DT[, sum := rowSums(DT[, ..valucols] * DT[, ..coefcols])]
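If the column order cannot be guaranteed, one possible workaround is to derive each value column's name from its coefficient column instead of relying on position. The following is only a minimal sketch: it assumes every coefficient column is named "coef" followed by the name of its value column, and the shuffled two-variable table and the column name lincomb are made up for illustration.

```r
library(data.table)
set.seed(1)

# Hypothetical table with value and coefficient columns deliberately out of order
DT <- data.table(coefB = rnorm(10),
                 A     = rnorm(10),
                 coefA = rnorm(10),
                 B     = rnorm(10))

# Pair columns by name: strip the "coef" prefix to get the matching value column
coefcols <- grep("^coef", names(DT), value = TRUE)
valucols <- sub("^coef", "", coefcols)
stopifnot(all(valucols %in% names(DT)))  # every coefficient needs a value column

# Element-wise product of the two aligned matrices, summed across each row
vals  <- as.matrix(DT[, ..valucols])
coefs <- as.matrix(DT[, ..coefcols])
DT[, lincomb := rowSums(vals * coefs)]
```

Because valucols is built from coefcols, the two matrices are aligned column by column regardless of how the columns are arranged in DT.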

Upvotes: 0

akrun

Reputation: 887691

One option is

DT[, sum := Reduce(`+`, DT[, 1:4] * DT[, 5:8])]

Or using .SD

DT[, sum := Reduce(`+`, .SD[, 1:4] * .SD[, 5:8])]

Or we can do

nm1 <- names(DT)[1:4]
nm2 <- paste0("coef", nm1)
DT[, sum := Reduce(`+`, Map(`*`, mget(nm1), mget(nm2)))]

Upvotes: 3

Dan

Reputation: 12084

With dplyr:

library(dplyr)
DT %>% mutate(sum = A * coefA + B * coefB + C * coefC + D * coefD)

Upvotes: 0
