Reputation: 345
I'm trying to calculate a column using lagging row numbers from several columns including the column I calculate. In effect, I want a self-referencing cumulative formula.
An example:
library(data.table)
library(lubridate)
dt <- data.table(date = today()+1:10, supply = c(1,1,2,3,1,0,1,2,1,3), demand = c(0,0,1,0,3,2,1,0,1,0), inventory = 0)
date supply demand inventory
1: 2018-10-18 1 0 0
2: 2018-10-19 1 0 0
3: 2018-10-20 2 1 0
4: 2018-10-21 3 0 0
5: 2018-10-22 1 3 0
6: 2018-10-23 0 2 0
7: 2018-10-24 1 1 0
8: 2018-10-25 2 0 0
9: 2018-10-26 1 1 0
10: 2018-10-27 3 0 0
What I'm trying to achieve is an inventory calculation referring to previous period inventory + supply - demand:
date supply demand inventory
1: 2018-10-18 1 0 0
2: 2018-10-19 1 0 1
3: 2018-10-20 2 1 2
4: 2018-10-21 3 0 3
5: 2018-10-22 1 3 6
6: 2018-10-23 0 2 4
7: 2018-10-24 1 1 2
8: 2018-10-25 2 0 2
9: 2018-10-26 1 1 4
10: 2018-10-27 3 0 4
However, when I try:
dt[,inventory := shift(inventory, fill = 0) + shift(supply, fill = 0) - shift(demand, fill = 0)]
I get:
date supply demand inventory
1: 2018-10-18 1 0 0
2: 2018-10-19 1 0 1
3: 2018-10-20 2 1 1
4: 2018-10-21 3 0 1
5: 2018-10-22 1 3 3
6: 2018-10-23 0 2 -2
7: 2018-10-24 1 1 -2
8: 2018-10-25 2 0 0
9: 2018-10-26 1 1 2
10: 2018-10-27 3 0 0
because shift is referring to a previously stored number, rather than the newly calculated one.
I could resolve this using a loop, but it doesn't appeal to me as an elegant solution. Is there any other way to get there?
Upvotes: 1
Views: 169
Reputation: 345
Ok, I think I've found a 'cheat' way to get around it:
dt[,':=' (cum_supply = cumsum(supply), cum_demand = cumsum(demand))]
dt[,inventory := dt[date==today()+1,inventory] + shift(cum_supply,fill = 0) - shift(cum_demand,fill = 0)]
However, if anyone can suggest a more elegant solution, I'd love to see it!
Upvotes: 1