Reputation: 1420
I have a data.table
of the following form:-
library(data.table)
b <- data.table(Code_Nm = c(rep("a", 3), rep("b", 3), rep("d", 3)),
Year = rep(2011:2013, 3), total = rep(10, 9),
a = c(rep(0, 3), 1:6), b = c(7:9, 0, 0, 0, 10:12), d = c(13:18, 0, 0, 0))
b
# Code_Nm Year total a b d
#1: a 2011 10 0 7 13
#2: a 2012 10 0 8 14
#3: a 2013 10 0 9 15
#4: b 2011 10 1 0 16
#5: b 2012 10 2 0 17
#6: b 2013 10 3 0 18
#7: d 2011 10 4 10 0
#8: d 2012 10 5 11 0
#9: d 2013 10 6 12 0
I want to get a data.table
such that the last column named actual
has the value equal to the total
minus the sum of yearly groups of each Code_Nm
. I.e. for a
in 2011
it is 10
minus sum of all the 2011
values but from column a
. This is equal to 10 - (1+4)
.
Similarly for a
in 2012
, it is 10
minus sum of all 2012
values but from column a
. This is equal to 10 - (2+5)
.
Similary for b
in 2011
it is 10
minus sum of all the 2011
values but from column b
. This is equal to 10 - (7+10)
. For b
in 2012
, it is 10
minus sum of all 2012
values but from column b
. This is equal to 10 - (8+11)
.
And similary for d
column as well.
The final result is the following data.table
:-
b <- data.table(Code_Nm = c(rep("a", 3), rep("b", 3), rep("d", 3)),
Year = rep(2011:2013, 3), total = rep(10, 9),
a = c(rep(0, 3), 1:6), b = c(7:9, 0, 0, 0, 10:12), d = c(13:18, 0, 0, 0),
actual = c(5, 3, 1, -7, -9, -11, -19, -21, -23))
b
# Code_Nm Year total a b d actual
#1: a 2011 10 0 7 13 5
#2: a 2012 10 0 8 14 3
#3: a 2013 10 0 9 15 1
#4: b 2011 10 1 0 16 -7
#5: b 2012 10 2 0 17 -9
#6: b 2013 10 3 0 18 -11
#7: d 2011 10 4 10 0 -19
#8: d 2012 10 5 11 0 -21
#9: d 2013 10 6 12 0 -23
Please give a solution using a data.table
.
Thanks in advance.
Upvotes: 3
Views: 101
Reputation: 2419
The idea is I can transpose each group for columns a,b,d
, then I can still use the previous method total - rowSum(.SD)
.
cols1 <- names(b)[4:6]
cols2 <- paste(names(b)[4:6],2,sep = "")
b[,(cols2):= lapply(split.default(as.data.frame(t(.SD)),cols1),unlist),
by = Year,
.SDcols = cols1]
b[, actual := total - rowSums(.SD), .SDcols = cols2]
b[,(cols2):=NULL][]
Is this your desired result?
b[,actual := total - ( a + b + d)][]
or
b[, actual := total - apply(.SD,1,sum),.SDcols = c(4:6)][]
Upvotes: 1
Reputation: 25225
Here is another option using melt
and then update by reference using a join:
m <- melt(b, id.vars="Year", measure.vars=c("a","b","d"))[, sum(value), .(Year, Code_Nm=variable)]
b[, actual := m[.SD, on=.(Code_Nm, Year), i.total - x.V1]]
Upvotes: 0
Reputation: 102469
You can try the code below
b[, actual := total - colSums(cbind(a, b, d)[, .SD[, Code_Nm]]), Year][]
which gives
Code_Nm Year total a b d actual
1: a 2011 10 0 7 13 5
2: a 2012 10 0 8 14 3
3: a 2013 10 0 9 15 1
4: b 2011 10 1 0 16 -7
5: b 2012 10 2 0 17 -9
6: b 2013 10 3 0 18 -11
7: d 2011 10 4 10 0 -19
8: d 2012 10 5 11 0 -21
9: d 2013 10 6 12 0 -23
Upvotes: 2
Reputation: 887691
We can use rowSums
on the Subset of data.table (.SD
) after specifying .SDcols
with the columns of interest and subtract from 'total'
b[, actual := total - rowSums(.SD), .SDcols = a:d]
-output
b
# Code_Nm Year total a b d actual
#1: a 2011 10 0 2 3 5
#2: a 2012 10 0 3 4 3
#3: a 2013 10 0 4 5 1
#4: b 2011 10 1 0 6 3
#5: b 2012 10 2 0 7 1
#6: b 2013 10 3 0 8 -1
#7: d 2011 10 4 5 0 1
#8: d 2012 10 5 6 0 -1
#9: d 2013 10 6 7 0 -3
Upvotes: 2