Reputation: 190
I have a small data.table representing one record per test cell (A/B testing results), and I want to add several more columns that compare each test cell against every other test cell. In other words, the number of columns I want to add depends on how many test cells are in the A/B test in question.
My data.table looks like:
Group    Delta    SD.diff
Control  0        0
Cell1    0.00200  0.001096139
Cell2    0.00196  0.001095797
Cell3    0.00210  0.001096992
Cell4    0.00160  0.001092716
And I want to add the following columns (numbers are trash here):
Group    v.Cell1  v.Cell2  v.Cell3  v.Cell4
Control  0.45     0.41     0.45     0.41
Cell1    0.50     0.58     0.48     0.66
Cell2    0.58     0.50     0.58     0.48
Cell3    0.48     0.58     0.50     0.70
Cell4    0.66     0.48     0.70     0.50
I am sure that do.call is the way to go, but I can't work out how to embed one do.call inside another to generate the script, and I can't work out how to then execute the generated lines (20 in total). The closest I have got so far is:
a <- do.call("paste",c("test.1.results <- mutate(test.1.results, P.Better.",list(unlist(test.1.results[,Group]))," = pnorm(Delta, test.1.results['",list(unlist(test.1.results[,Group])),"'][,Delta], SD.diff,lower.tail=TRUE))", sep=""))
Which produces 5 script lines like:
test.1.results <- mutate(test.1.results, P.Better.Cell2 = pnorm(Delta, test.1.results['Cell2'][,Delta], SD.diff,lower.tail=TRUE))
This only compares each test cell's result against itself, which gives 0.50 (a difference due to chance). That is no use whatsoever, as I need each test cell compared against every other one.
Not sure where to go with this one.
Upvotes: 2
Views: 1326
Reputation: 59602
Update: In v1.8.11, FR #2077 is now implemented - set() can now add columns by reference. From NEWS:
set() is able to add new columns by reference now. For example, set(DT, i=3:5, j="bla", 5L) is equivalent to DT[3:5, bla := 5L]. This was FR #2077. Tests added.
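A minimal sketch of that equivalence, assuming data.table v1.8.11 or later. The tables DT1 and DT2 and their id column are made up for illustration; only set(), := and the column name bla come from the NEWS entry.

```r
library(data.table)

# Two identical toy tables (hypothetical data, for illustration only)
DT1 <- data.table(id = 1:6)
DT2 <- copy(DT1)

# Add a new column "bla" on rows 3:5 only, by reference, in two ways
set(DT1, i = 3:5, j = "bla", value = 5L)
DT2[3:5, bla := 5L]

# Both calls create the same column; rows outside 3:5 are filled with NA
same_result <- identical(DT1$bla, DT2$bla)
print(DT1)
```

copy() is used so that the two tables are independent; a plain DT2 <- DT1 would make both names refer to the same object.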
Tasks like this are often easier with set(). To demonstrate, here's a translation of what you have in the question (untested). But I realise you want something different from what you've posted (which I don't quite understand at a quick read).
# assumes DT is keyed by Group, i.e. setkey(DT, Group), so that DT[i] looks up a cell by name
for (i in paste0("Cell", 1:4))
  set(DT,                          # the data.table to update/add column by reference
      i = NULL,                    # no row subset, NULL is default anyway
      j = paste0("P.Better.", i),  # column name or position. must be name when adding
      value = pnorm(DT$Delta, DT[i][, Delta], DT$SD.diff, lower.tail = TRUE))
Note that you can add only a subset of a new column and the rest will be filled with NA. This works with both := and set().
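For the pairwise layout the question asks for, the same loop could be driven by the Group column instead of a hard-coded 1:4, so the number of added columns follows the number of test cells. The sketch below is only an illustration under assumptions: it reuses the pnorm() call from the question unchanged (whether SD.diff is the right standard deviation for a cell-versus-cell comparison is not something this answer settles), it treats "Control" as the only row that gets no column of its own, and the v.<cell> names follow the desired output in the question.

```r
library(data.table)

# The table from the question
DT <- data.table(
  Group   = c("Control", "Cell1", "Cell2", "Cell3", "Cell4"),
  Delta   = c(0, 0.00200, 0.00196, 0.00210, 0.00160),
  SD.diff = c(0, 0.001096139, 0.001095797, 0.001096992, 0.001092716)
)

# One new column per non-control cell, however many there are
cells <- setdiff(DT$Group, "Control")

for (cell in cells) {
  # Delta of the cell that every row is compared against
  other_delta <- DT[Group == cell, Delta]
  set(DT,
      j = paste0("v.", cell),
      # Same pnorm() call as in the question (an assumption, see above)
      value = pnorm(DT$Delta, mean = other_delta, sd = DT$SD.diff))
}

print(DT)
```

Each cell compared against itself comes out as 0.5, matching the diagonal in the desired output. Looking the cell up with Group == cell avoids the need for a key on DT.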
Upvotes: 3