statquant
statquant

Reputation: 14370

How can I access all columns of `i` when using .EACHI in data.table

as per ?data.table when using x[i,j] where i is a data.table, columns of i can be accessed with i.column in j. My question is how can I access ALL columns at once ?

Example:

library(data.table)
set.seed(1L)
DT <- data.table(A=c('a','a','b','b','c','c'),
                 B=rnorm(6L),
                 C=rnorm(6L),
                 key='A')
#   A           B           C
#1: a  0.82122120  0.61982575
#2: a  0.59390132 -0.05612874
#3: b  0.91897737 -0.15579551
#4: b  0.78213630 -1.47075238
#5: c  0.07456498 -0.47815006
#6: c -1.98935170  0.41794156

DT2 <- data.table(A=c('a','b','c'),
                  B=rnorm(3L),
                  C=rnorm(3L),
                  key='A')
#   A          B           C
#1: a  1.3586796 -0.05380504
#2: b -0.1027877 -1.37705956
#3: c  0.3876716 -0.41499456

Now say I want to multiply the column B of DT by the number B of DT2 for each matching value of column A. That is to say

0.82122120 * 1.3586796
0.59390132 * 1.3586796
...

I can do the following, but only because I know the columns are called B and C, and because I an patient enough to write all columns.

How could I do a similar thing with 100 columns for instance ?

DT[DT2,{print(as.matrix(.SD) %*% diag(c(i.B,i.C)))},by=.EACHI]
          [,1]         [,2]
[1,] 1.1157764 -0.033349750
[2,] 0.8069216  0.003020009
            [,1]      [,2]
[1,] -0.09445960 0.2145397
[2,] -0.08039401 2.0253136
            [,1]       [,2]
[1,]  0.02890673  0.1984297
[2,] -0.77121518 -0.1734435

Upvotes: 3

Views: 213

Answers (1)

Uwe
Uwe

Reputation: 42544

Indeed, there seem to be no special symbols like x.SD or i.SD to specify all columns (except the ones used in by = ....

However, mget() can be used as a workaround which returns the expected results:

DT2_cols <- paste0("i.", setdiff(names(DT2), key(DT2)))
DT[DT2,{print(as.matrix(.SD) %*% diag(mget(DT2_cols)))},by=.EACHI]
           [,1]        [,2]
[1,]  0.3891785 -0.02190195
[2,] -0.1140867 -0.03317559
          [,1]         [,2]
[1,]  1.850667 -0.009322052
[2,] -3.533068  0.004944318
           [,1]      [,2]
[1,]  0.3706735 1.4268738
[2,] -0.9229703 0.3679482
Empty data.table (0 rows) of 3 cols: A,B,C

(Note that the numbers differ from OP's because using set.seed(1L) I do get different DT and DT2 than posted by the OP.)

Edit: Feature request on GitHub

I just noticed that this feature already has been requested on GitHub.

Upvotes: 4

Related Questions