Mostafa90
Mostafa90

Reputation: 1706

Sum in data.table grouping with condition

I'm not use to work with data.table and need some help for doing this type of operations

My data :

library(data.table)

x = c(rep('a', 3), rep('b', 4), 'c')

y = c(1, 2, 1, 4, 4, 2, 4, 5)

dt = data.frame(x , y)

My operation : I want to groupby the x variable, and sum on unique value of y

setDT(dt)[, sm := sum(y), by = list(x)]

The output is :

   x y sm
1: a 1  4
2: a 2  4
3: a 1  4
4: b 4 14
5: b 4 14
6: b 2 14
7: b 4 14
8: c 5  5

But I want :

   x y sm
1: a 1  3
2: a 2  3
3: a 1  3
4: b 4  6
5: b 4  6
6: b 2  6
7: b 4  6
8: c 5  5

I probably have to use the .SD but I dont know how !

Thanks for help

Upvotes: 3

Views: 66

Answers (4)

s_baldur
s_baldur

Reputation: 33488

Another solution (convoluted, but fun):

dt[, sm := unique(dt)[, sum(y), x][.SD, on = "x", V1]]

Upvotes: 1

Prahlad
Prahlad

Reputation: 138

library(data.table)
dt[,.(sum(unique(y))),by=x]

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 388982

You could sum unique values.

library(data.table)
setDT(dt)[, sm := sum(unique(y)), x]
dt

#   x y sm
#1: a 1  3
#2: a 2  3
#3: a 1  3
#4: b 4  6
#5: b 4  6
#6: b 2  6
#7: b 4  6
#8: c 5  5

Upvotes: 2

tmfmnk
tmfmnk

Reputation: 39858

One option could be:

setDT(dt)[, sm := sum(y[!duplicated(y)]), by = x]

   x y sm
1: a 1  3
2: a 2  3
3: a 1  3
4: b 4  6
5: b 4  6
6: b 2  6
7: b 4  6
8: c 5  5

Upvotes: 1

Related Questions