Reputation: 1420

Group sum on 2 variables and subtraction from total

I have a data.table of the following form:-

library(data.table)
b <- data.table(Code_Nm = c(rep("a", 3), rep("b", 3), rep("d", 3)),
            Year = rep(2011:2013, 3), total = rep(10, 9),
            a = c(rep(0, 3), 1:6), b = c(7:9, 0, 0, 0, 10:12), d = c(13:18, 0, 0, 0))
b

#   Code_Nm Year total a  b  d
#1:       a 2011    10 0  7 13
#2:       a 2012    10 0  8 14
#3:       a 2013    10 0  9 15
#4:       b 2011    10 1  0 16
#5:       b 2012    10 2  0 17
#6:       b 2013    10 3  0 18
#7:       d 2011    10 4 10  0
#8:       d 2012    10 5 11  0
#9:       d 2013    10 6 12  0

I want to get a data.table such that the last column named actual has the value equal to the total minus the sum of yearly groups of each Code_Nm. I.e. for a in 2011 it is 10 minus sum of all the 2011 values but from column a. This is equal to 10 - (1+4).

Similarly for a in 2012, it is 10 minus sum of all 2012 values but from column a. This is equal to 10 - (2+5).

Similary for b in 2011 it is 10 minus sum of all the 2011 values but from column b. This is equal to 10 - (7+10). For b in 2012, it is 10 minus sum of all 2012 values but from column b. This is equal to 10 - (8+11).

And similary for d column as well.

The final result is the following data.table:-

b <- data.table(Code_Nm = c(rep("a", 3), rep("b", 3), rep("d", 3)),
            Year = rep(2011:2013, 3), total = rep(10, 9),
            a = c(rep(0, 3), 1:6), b = c(7:9, 0, 0, 0, 10:12), d = c(13:18, 0, 0, 0),
            actual = c(5, 3, 1, -7, -9, -11, -19, -21, -23))
b

#   Code_Nm Year total a  b  d actual
#1:       a 2011    10 0  7 13      5
#2:       a 2012    10 0  8 14      3
#3:       a 2013    10 0  9 15      1
#4:       b 2011    10 1  0 16     -7
#5:       b 2012    10 2  0 17     -9
#6:       b 2013    10 3  0 18    -11
#7:       d 2011    10 4 10  0    -19
#8:       d 2012    10 5 11  0    -21
#9:       d 2013    10 6 12  0    -23

Please give a solution using a data.table.

Thanks in advance.

Upvotes: 3

Answers (4)

Peace Wang

Reputation: 2419

Updated:

The idea is I can transpose each group for columns a,b,d, then I can still use the previous method total - rowSum(.SD).

cols1 <- names(b)[4:6]
cols2 <- paste(names(b)[4:6],2,sep = "")

b[,(cols2):= lapply(split.default(as.data.frame(t(.SD)),cols1),unlist),
  by = Year,
  .SDcols = cols1]

b[, actual := total - rowSums(.SD), .SDcols = cols2]
b[,(cols2):=NULL][]

Is this your desired result?

b[,actual := total - ( a + b + d)][]

b[, actual := total - apply(.SD,1,sum),.SDcols = c(4:6)][]

Upvotes: 1

chinsoon12

Reputation: 25225

Here is another option using melt and then update by reference using a join:

m <- melt(b, id.vars="Year", measure.vars=c("a","b","d"))[, sum(value), .(Year, Code_Nm=variable)]
b[, actual := m[.SD, on=.(Code_Nm, Year), i.total - x.V1]]

Upvotes: 0

ThomasIsCoding

Reputation: 102469

You can try the code below

b[, actual := total - colSums(cbind(a, b, d)[, .SD[, Code_Nm]]), Year][]

which gives

   Code_Nm Year total a  b  d actual
1:       a 2011    10 0  7 13      5
2:       a 2012    10 0  8 14      3
3:       a 2013    10 0  9 15      1
4:       b 2011    10 1  0 16     -7
5:       b 2012    10 2  0 17     -9
6:       b 2013    10 3  0 18    -11
7:       d 2011    10 4 10  0    -19
8:       d 2012    10 5 11  0    -21
9:       d 2013    10 6 12  0    -23

Upvotes: 2

akrun

Reputation: 887691

We can use rowSums on the Subset of data.table (.SD) after specifying .SDcols with the columns of interest and subtract from 'total'

b[, actual := total  - rowSums(.SD), .SDcols = a:d]

-output

b
#    Code_Nm Year total a b d actual
#1:       a 2011    10 0 2 3      5
#2:       a 2012    10 0 3 4      3
#3:       a 2013    10 0 4 5      1
#4:       b 2011    10 1 0 6      3
#5:       b 2012    10 2 0 7      1
#6:       b 2013    10 3 0 8     -1
#7:       d 2011    10 4 5 0      1
#8:       d 2012    10 5 6 0     -1
#9:       d 2013    10 6 7 0     -3

Upvotes: 2

Group sum on 2 variables and subtraction from total

Answers (4)

Updated:

Previous

Related Questions