Reputation: 345
I'm struggling with .SD calls in data.table.
In particular, I'm trying to identify some logical characteristic within a grouping of data, and draw some identifying mark in another variable. Canonical application of .SD, right?
From FAQ 4.5, http://cran.r-project.org/web/packages/data.table/vignettes/datatable-faq.pdf, imagine the following table:
library(data.table) # 1.9.5
DT = data.table(a=rep(1:3,1:3),b=1:6,c=7:12)
DT[, {
  mySD = copy(.SD)
  mySD[1, b := 99L]
  mySD
}, by = a]
## a b c
## 1: 1 99 7
## 2: 2 99 8
## 3: 2 3 9
## 4: 3 99 10
## 5: 3 5 11
## 6: 3 6 12
I've assigned these values to b (using the ':=' operator) and so when I re-call DT, I expect the same output. But, unexpectedly, I'm met with the original table:
DT
## a b c
## 1: 1 1 7
## 2: 2 2 8
## 3: 2 3 9
## 4: 3 4 10
## 5: 3 5 11
## 6: 3 6 12
Expected output was the original frame, with persistent modifications in 'b':
DT
## a b c
## 1: 1 99 7
## 2: 2 99 8
## 3: 2 3 9
## 4: 3 99 10
## 5: 3 5 11
## 6: 3 6 12
Sure, I can copy the result into another table, but that doesn't seem consistent with the data.table ethos of updating by reference.
DT2 <- copy(DT[, {
  mySD = copy(.SD)
  mySD[1, b := 99L]
  mySD
}, by = a])
DT2
## a b c
## 1: 1 99 7
## 2: 2 99 8
## 3: 2 3 9
## 4: 3 99 10
## 5: 3 5 11
## 6: 3 6 12
It feels like I'm missing something fundamental here.
Upvotes: 3
Views: 93
Reputation: 92282
The mentioned FAQ is just showing a workaround for modifying (a temporary copy of) .SD,
but it won't update your original data in place. A possible solution for your problem would be something like
DT[DT[, .I[1L], by = a]$V1, b := 99L]
DT
# a b c
# 1: 1 99 7
# 2: 2 99 8
# 3: 2 3 9
# 4: 3 99 10
# 5: 3 5 11
# 6: 3 6 12
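To make the one-liner above easier to follow, here is a rough breakdown into two steps, using the same DT as in the question. The name idx is only for illustration. The second part, using replace() with := grouped by a, is an alternative I'd expect to give the same result; it is not from the FAQ, so treat it as a sketch.

```r
library(data.table)

DT <- data.table(a = rep(1:3, 1:3), b = 1:6, c = 7:12)

# Step 1: .I holds row numbers in the full table, so .I[1L] evaluated
# by group gives the position of the first row of each group in DT.
idx <- DT[, .I[1L], by = a]$V1   # 'idx' is an illustrative name
idx
# [1] 1 2 4

# Step 2: use those row numbers in i and assign by reference with :=.
# This modifies DT itself, so no copy or reassignment is needed.
DT[idx, b := 99L]

# Alternative sketch (an assumption, not from the FAQ): assign by group,
# replacing only the first element of b within each group.
DT_alt <- data.table(a = rep(1:3, 1:3), b = 1:6, c = 7:12)
DT_alt[, b := replace(b, 1L, 99L), by = a]
```

In both cases the change is made to the original table, which is what the copy(.SD) approach cannot do, since it only ever touches a copy of each group's subset.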
Upvotes: 6