ErrHuman
ErrHuman

Reputation: 345

Calculate a column using self-referencing lagging formula in R

I'm trying to calculate a column using lagging row numbers from several columns including the column I calculate. In effect, I want a self-referencing cumulative formula.

An example:

library(data.table)
library(lubridate)

dt <- data.table(date = today()+1:10, supply = c(1,1,2,3,1,0,1,2,1,3), demand = c(0,0,1,0,3,2,1,0,1,0), inventory = 0)
          date supply demand inventory
 1: 2018-10-18      1      0         0
 2: 2018-10-19      1      0         0
 3: 2018-10-20      2      1         0
 4: 2018-10-21      3      0         0
 5: 2018-10-22      1      3         0
 6: 2018-10-23      0      2         0
 7: 2018-10-24      1      1         0
 8: 2018-10-25      2      0         0
 9: 2018-10-26      1      1         0
10: 2018-10-27      3      0         0

What I'm trying to achieve is an inventory calculation referring to previous period inventory + supply - demand:

          date supply demand inventory
 1: 2018-10-18      1      0         0
 2: 2018-10-19      1      0         1
 3: 2018-10-20      2      1         2
 4: 2018-10-21      3      0         3
 5: 2018-10-22      1      3         6
 6: 2018-10-23      0      2         4
 7: 2018-10-24      1      1         2
 8: 2018-10-25      2      0         2
 9: 2018-10-26      1      1         4
10: 2018-10-27      3      0         4

However, when I try:

dt[,inventory := shift(inventory, fill = 0) + shift(supply, fill = 0) - shift(demand, fill = 0)]

I get:

          date supply demand inventory
 1: 2018-10-18      1      0         0
 2: 2018-10-19      1      0         1
 3: 2018-10-20      2      1         1
 4: 2018-10-21      3      0         1
 5: 2018-10-22      1      3         3
 6: 2018-10-23      0      2        -2
 7: 2018-10-24      1      1        -2
 8: 2018-10-25      2      0         0
 9: 2018-10-26      1      1         2
10: 2018-10-27      3      0         0

because shift is referring to a previously stored number, rather than the newly calculated one.

I could resolve this using a loop, but it doesn't appeal to me as an elegant solution. Is there any other way to get there?

Upvotes: 1

Views: 169

Answers (1)

ErrHuman
ErrHuman

Reputation: 345

Ok, I think I've found a 'cheat' way to get around it:

dt[,':=' (cum_supply = cumsum(supply), cum_demand = cumsum(demand))]
dt[,inventory := dt[date==today()+1,inventory] + shift(cum_supply,fill = 0) - shift(cum_demand,fill = 0)]

However, if anyone can suggest a more elegant solution, I'd love to see it!

Upvotes: 1

Related Questions