Reputation: 569
it might be that this is trivial, but I'm probably confused.
I have something like this:
set.seed(1234)
dt<-data.table(day=sample(c("day1","day2","day3"),20,replace = T),
store=sample(c("store1","store2","store3"),20,replace=T),
x=rnorm(20,33,6),y=rnorm(20,12,10))
I am interested in aggregating by day and store:
dt[,.(sumx=sum(x),sumy=sum(y)),by=c("day","store")]
day store sumx sumy
1: day1 store2 56.33890 44.52312
2: day2 store1 164.72854 61.37866
3: day3 store3 144.52483 53.74347
4: day1 store3 56.25504 34.00066
5: day3 store1 70.61311 30.85589
6: day2 store3 123.34534 74.67024
7: day2 store2 35.72952 21.19009
But also, more globally, by day only:
dt[,.(sumx=sum(x),sumy=sum(y)),by=day]
day sumx sumy
1: day1 112.5939 78.52378
2: day2 323.8034 157.23899
3: day3 215.1379 84.59936
In practice, I would like to end up with a dataset that has aggregation per day and store, and an additional column with day only aggregation:
day store sumx sumy sumx_daylevel sumy_daylevel
1: day1 store2 56.33890 44.52312 112.5939 78.52378
2: day2 store1 164.72854 61.37866 323.8034 157.23899
3: day3 store3 144.52483 53.74347 215.1379 84.59936
4: day1 store3 56.25504 34.00066 112.5939 78.52378
5: day3 store1 70.61311 30.85589 215.1379 84.59936
6: day2 store3 123.34534 74.67024 323.8034 157.23899
7: day2 store2 35.72952 21.19009 323.8034 157.23899
I'd like to achieve that wrapping everything in a function, no merging. Any help will be appreciated. Thank you
Upvotes: 1
Views: 328
Reputation: 887531
We can use :=
to create new columns
dt[,.(sumx=sum(x),sumy=sum(y)),by=c("day","store")
][, c("sumx_daylevel", "sumy_daylevel") := .(sum(sumx), sum(sumy)), day][]
Upvotes: 2