alex

Reputation: 345

Persistent assignment in data.table with .SD

I'm struggling with .SD calls in data.table.

In particular, I'm trying to identify some logical characteristic within a group of data and flag it in another variable. Canonical application of .SD, right?

From FAQ 4.5, http://cran.r-project.org/web/packages/data.table/vignettes/datatable-faq.pdf, imagine the following table:

library(data.table) # 1.9.5

DT = data.table(a=rep(1:3,1:3),b=1:6,c=7:12)
DT[,{ mySD = copy(.SD)
      mySD[1, b := 99L]
      mySD },
    by = a]
##    a  b  c
## 1: 1 99  7
## 2: 2 99  8
## 3: 2  3  9
## 4: 3 99 10
## 5: 3  5 11
## 6: 3  6 12

I've assigned these values to b (using the ':=' operator) and so when I re-call DT, I expect the same output. But, unexpectedly, I'm met with the original table:

DT
##    a b  c
## 1: 1 1  7
## 2: 2 2  8
## 3: 2 3  9
## 4: 3 4 10
## 5: 3 5 11
## 6: 3 6 12

Expected output was the original frame, with persistent modifications in 'b':

DT
##    a  b  c
## 1: 1 99  7
## 2: 2 99  8
## 3: 2  3  9
## 4: 3 99 10
## 5: 3  5 11
## 6: 3  6 12

Sure, I can copy this table into another one, but that doesn't seem consistent with data.table's modify-by-reference ethos.

DT2 <- copy(DT[,{ mySD = copy(.SD)
                  mySD[1, b := 99L]
                  mySD },
               by = a])
DT2
##    a  b  c
## 1: 1 99  7
## 2: 2 99  8
## 3: 2  3  9
## 4: 3 99 10
## 5: 3  5 11
## 6: 3  6 12

It feels like I'm missing something fundamental here.

Upvotes: 3

Views: 93

Answers (1)

David Arenburg

Reputation: 92282

The FAQ you mention only shows a workaround for modifying a temporary copy of .SD; it won't update your original data in place. Because ':=' is applied to mySD, which is a copy, the change never reaches DT. A possible solution for your problem would be something like

DT[DT[, .I[1L], by = a]$V1, b := 99L]
DT
#    a  b  c
# 1: 1 99  7
# 2: 2 99  8
# 3: 2  3  9
# 4: 3 99 10
# 5: 3  5 11
# 6: 3  6 12
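
To make the one-liner above easier to follow, here is a rough sketch that splits it into two steps on the same example table. The name first_rows is only an illustrative variable, and the replace() variant at the end is an alternative I'm assuming fits this use case rather than anything taken from the FAQ.

```r
library(data.table)

DT <- data.table(a = rep(1:3, 1:3), b = 1:6, c = 7:12)

# Step 1: .I holds each row's position in the full table, so .I[1L]
# evaluated by group gives the first row number of every group.
first_rows <- DT[, .I[1L], by = a]$V1

# Step 2: ':=' is applied to DT itself (not to a copy of .SD),
# so the update happens by reference and persists.
DT[first_rows, b := 99L]

# Alternative sketch (an assumption, not from the FAQ): a grouped ':='
# combined with base R's replace() also modifies the table in place.
DT2 <- data.table(a = rep(1:3, 1:3), b = 1:6, c = 7:12)
DT2[, b := replace(b, 1L, 99L), by = a]
```

Both DT and DT2 should end up with the first b of each group set to 99. The .I version is probably the safer choice when the rows to change depend on a more involved per-group condition, since it separates finding the rows from assigning to them.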

Upvotes: 6
