Reputation: 1706
I'm not use to work with data.table and need some help for doing this type of operations
My data :
library(data.table)
x = c(rep('a', 3), rep('b', 4), 'c')
y = c(1, 2, 1, 4, 4, 2, 4, 5)
dt = data.frame(x , y)
My operation : I want to groupby the x variable, and sum on unique value of y
setDT(dt)[, sm := sum(y), by = list(x)]
The output is :
x y sm
1: a 1 4
2: a 2 4
3: a 1 4
4: b 4 14
5: b 4 14
6: b 2 14
7: b 4 14
8: c 5 5
But I want :
x y sm
1: a 1 3
2: a 2 3
3: a 1 3
4: b 4 6
5: b 4 6
6: b 2 6
7: b 4 6
8: c 5 5
I probably have to use the .SD
but I dont know how !
Thanks for help
Upvotes: 3
Views: 66
Reputation: 33488
Another solution (convoluted, but fun):
dt[, sm := unique(dt)[, sum(y), x][.SD, on = "x", V1]]
Upvotes: 1
Reputation: 388982
You could sum
unique
values.
library(data.table)
setDT(dt)[, sm := sum(unique(y)), x]
dt
# x y sm
#1: a 1 3
#2: a 2 3
#3: a 1 3
#4: b 4 6
#5: b 4 6
#6: b 2 6
#7: b 4 6
#8: c 5 5
Upvotes: 2
Reputation: 39858
One option could be:
setDT(dt)[, sm := sum(y[!duplicated(y)]), by = x]
x y sm
1: a 1 3
2: a 2 3
3: a 1 3
4: b 4 6
5: b 4 6
6: b 2 6
7: b 4 6
8: c 5 5
Upvotes: 1