Reputation: 568
data:
test <- structure(list(fgu_clms = c(14621389.763697, 145818119.352026,
21565415.2337476, 20120830.8221406, 12999772.0950838), loss_to_layer = c(0,
125818119.352026, 1565415.23374765, 120830.822140567, 0)), row.names = c(NA,
5L), class = "data.frame")
> test
fgu_clms loss_to_layer
1 14621390 0.0
2 145818119 125818119.4
3 21565415 1565415.2
4 20120831 120830.8
5 12999772 0.0
I want to create a new column in which each value depends on the cumulative sum of the values above it in that same new column. It's easiest to show how the calculation works row by row:
Row 1: first calculate the sum of the new-column values in the rows above. As this is row 1, there are no rows above, so this sum is 0; call it cumsum_1. The new value is then the minimum of the row 1 value of "loss_to_layer" and "x2 - cumsum_1".
Row 2: calculate the sum of the new-column values above, which here is just the row 1 result, i.e. min(loss_to_layer value, x2 - cumsum_1). Call this cumsum_2. Then repeat as above: the new value is the minimum of the row 2 value of "loss_to_layer" and "x2 - cumsum_2".
And so on for the remaining rows.
In Excel, with loss_to_layer in column B and the new column in column C, this would be done by entering MIN(B2,x2 - SUM(C$1:C1)) and dragging the formula down.
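To pin down the recursion described above, here is a minimal base R sketch that translates the Excel formula literally with a loop. The function name `capped_alloc` and the variable `used` are hypothetical names chosen for illustration; they are not part of the original data or any package.

```r
# Hypothetical helper: literal translation of MIN(B2, x2 - SUM(C$1:C1)).
capped_alloc <- function(loss, x2) {
  out <- numeric(length(loss))
  used <- 0  # running sum of the new-column values in the rows above
  for (i in seq_along(loss)) {
    out[i] <- min(loss[i], x2 - used)
    used <- used + out[i]
  }
  out
}

# loss_to_layer values copied from the dput() output above
loss_to_layer <- c(0, 125818119.352026, 1565415.23374765, 120830.822140567, 0)
new_col <- capped_alloc(loss_to_layer, x2 = 127e6)
round(new_col)
```

The loop is only meant to state the intended logic unambiguously; it is not vectorised, which is why a dplyr or cumsum-based version is what I am after.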
The results with x2 = 127,000,000 should be:
fgu_clms loss_to_layer new_col
1 14621390 0.0 0
2 145818119 125818119.4 125818119
3 21565415 1565415.2 1181881
4 20120831 120830.8 0
5 12999772 0.0 0
As you can see, the values in "new_col" sum to "x2", in this case 127,000,000. This holds whenever the total of "loss_to_layer" is at least "x2".
I have tried:
test <- test %>% mutate(new_col = pmin(loss_to_layer,127e6-cumsum(lag(new_col,1,default=0))))
But I get an error, because the column new_col does not exist yet when lag() is evaluated inside mutate().
Upvotes: 0
Views: 1205
Reputation: 5232
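library(dplyr)

test %>%
  mutate(
    # losses already absorbed by the rows above (0 for the first row)
    cumsum_1 = cumsum(lag(loss_to_layer, default = 0)),
    # cap each row at what is left of the 127,000,000 limit
    new_col = pmin(loss_to_layer, 127000000 - cumsum_1),
    # once the limit is used up the remainder goes negative; floor it at 0
    new_col = pmax(new_col, 0)
  ) %>%
  select(-cumsum_1)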
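The approach above takes the cumulative sum of loss_to_layer itself rather than of new_col, which is what avoids the self-reference. An equivalent way to see it, sketched below in base R, is to cap the running total at the limit and then take row-by-row differences. This assumes every loss_to_layer value is non-negative; the variable names `x2` and `capped_total` are illustrative only.

```r
# loss_to_layer values copied from the question's dput() output
loss_to_layer <- c(0, 125818119.352026, 1565415.23374765, 120830.822140567, 0)
x2 <- 127e6  # limit from the question

# Running total of losses, capped at the limit
capped_total <- pmin(cumsum(loss_to_layer), x2)

# Each row's share is the increase in the capped running total
new_col <- diff(c(0, capped_total))
round(new_col)
```

Because the capped running total never decreases when the losses are non-negative, the differences are never negative, so no separate floor at 0 is needed in this form.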
Upvotes: 1