Andrew Dempsey

Reputation: 190

do.call to build and execute data.table commands

I have a small data.table with one record per test cell (A/B testing results), and I want to add several columns that compare each test cell against every other test cell. In other words, the number of columns to add depends on how many test cells are in the A/B test in question.

My data.table looks like:

Group     Delta     SD.diff
Control   0         0
Cell1     0.00200   0.001096139
Cell2     0.00196   0.001095797
Cell3     0.00210   0.001096992
Cell4     0.00160   0.001092716

And I want to add the following columns (the numbers here are placeholders):

Group     v.Cell1   v.Cell2   v.Cell3   v.Cell4
Control   0.45      0.41      0.45      0.41
Cell1     0.50      0.58      0.48      0.66
Cell2     0.58      0.50      0.58      0.48
Cell3     0.48      0.58      0.50      0.70
Cell4     0.66      0.48      0.70      0.50

I am sure that do.call is the way to go, but I can't work out how to embed one do.call inside another to generate the script, and I can't work out how to then execute the generated script lines (20 in total). The closest I have got so far is:

a <- do.call("paste",c("test.1.results <- mutate(test.1.results, P.Better.",list(unlist(test.1.results[,Group]))," = pnorm(Delta, test.1.results['",list(unlist(test.1.results[,Group])),"'][,Delta], SD.diff,lower.tail=TRUE))", sep=""))

This produces 5 script lines like:

test.1.results <- mutate(test.1.results, P.Better.Cell2 = pnorm(Delta, test.1.results['Cell2'][,Delta], SD.diff,lower.tail=TRUE))

Each of these only compares one test cell's results against itself, which gives 0.50 (a difference due to chance). That is no use whatsoever, as I need each test cell compared against every other.

Not sure where to go with this one.

Upvotes: 2

Views: 1326

Answers (1)

Matt Dowle

Reputation: 59602

Update: In v1.8.11, FR #2077 is now implemented: set() can now add columns by reference. From NEWS:

set() is able to add new columns by reference now. For example, set(DT, i=3:5, j="bla", 5L) is equivalent to DT[3:5, bla := 5L]. This was FR #2077. Tests added.
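As a small illustration of the NEWS item above, the following sketch applies both forms to two copies of a toy table and checks that they end up the same. The table and its `id` column are made up for the example; only `set()`, `:=` and the `bla` column come from the NEWS text.

```r
library(data.table)

# Hypothetical toy table; the NEWS item does not say what DT contains
DT1 <- data.table(id = 1:6)
DT2 <- copy(DT1)               # independent copy, so the two forms don't interfere

# Form 1: set() adds the new column "bla" by reference, filling rows 3 to 5
set(DT1, i = 3:5, j = "bla", value = 5L)

# Form 2: the equivalent := call
DT2[3:5, bla := 5L]

print(DT1)                     # rows outside 3:5 hold NA in the new column
print(isTRUE(all.equal(DT1, DT2)))
```

Because only rows 3 to 5 are assigned, the remaining rows of the new column are filled with NA in both versions.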


Tasks like this are often easier with set(). To demonstrate, here's a translation of what you have in the question (untested). I realise you want something different from what you've posted, though I don't quite follow what from a quick read.

for (i in paste0("Cell",1:4))
  set(DT,                    # the data.table to update/add column by reference
    i=NULL,                  # no row subset, NULL is default anyway
    j=paste0("P.Better.",i), # column name or position. must be name when adding
    value = pnorm(DT$Delta, DT[i][,Delta], DT$SD.diff, lower.tail=TRUE)
  )                          # DT[i] is a keyed lookup, so run setkey(DT, Group) first

Note that you can add only a subset of a new column and the rest will be filled with NA. This works with both := and set().
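For a version that runs end to end, here is a minimal sketch using the data from the question. It makes a few assumptions that are not confirmed by the question: the new columns are named `v.<cell>` as in the desired output, every group except Control gets a column, and the pnorm() call from the question is the comparison that is wanted. The reference Delta is looked up with a plain logical subset, so no key is needed.

```r
library(data.table)

# Data copied from the question
DT <- data.table(
  Group   = c("Control", "Cell1", "Cell2", "Cell3", "Cell4"),
  Delta   = c(0, 0.00200, 0.00196, 0.00210, 0.00160),
  SD.diff = c(0, 0.001096139, 0.001095797, 0.001096992, 0.001092716)
)

# Assumption: one new column per group other than Control
cells <- setdiff(DT$Group, "Control")

for (cell in cells) {
  # Delta of the cell being compared against (a single number)
  ref <- DT$Delta[DT$Group == cell]
  # Add the column by reference; every row is compared against `ref`
  set(DT,
      j     = paste0("v.", cell),
      value = pnorm(DT$Delta, mean = ref, sd = DT$SD.diff, lower.tail = TRUE))
}

print(DT)
```

Since the loop runs over whatever groups are present, the number of added columns follows the number of test cells. Each cell compared against itself gives 0.5, as described in the question.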

Upvotes: 3
