Reputation: 1318
I am trying to calculate excess cash from the Withdrawn and Replenished amount columns of a pandas DataFrame.
The logic is: on rows where a replenishment has taken place, excess cash is the replenished amount minus the withdrawn amount. On rows with no replenishment, excess cash is the previous row's excess cash minus the current row's withdrawn amount.
Example:
Withdrawn Replenished
0 10 80
1 20 0
2 30 0
3 10 0
4 20 30
For the table above, I used the following code for the first part of the logic: wherever there has been a replenishment, subtract the Withdrawn amount from the Replenished amount.
df['Excess'] = 0
df['Excess'] = np.where(df['Replenished'] > 0, df['Replenished'] - df['Withdrawn'], 0)
The second part is where I am facing the problem. For rows where the Replenished column is 0, I need to take the previous row's excess cash and subtract the current row's withdrawn amount. For that I've used the following code:
df['Excess'] = np.where(df['Replenished'] == 0, df['Excess'].shift(1) - df['Withdrawn'], df['Excess'])
It results in the following table:
Withdrawn Replenished Excess
0 10 80 70.0
1 20 0 50.0
2 30 0 -30.0
3 10 0 -10.0
4 20 30 10.0
For the first row where the Replenished column is 0 (the 2nd row), the calculation works correctly. But in the third and fourth rows, where Replenished is also 0, the previous row's Excess is still the 0 set by the first step, so subtracting the current row's withdrawn value from it gives -30 and -10 respectively.
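To make the failure easier to see, here is a small sketch that reproduces the two steps and keeps the intermediate values. The names step1, shifted and step2 are only illustrative and are not part of my original code. The point is that shift(1) reads the Excess column as it stood after the first step, not the values being computed in the second step.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Withdrawn": [10, 20, 30, 10, 20],
                   "Replenished": [80, 0, 0, 0, 30]})

# Step 1: rows with a replenishment get Replenished - Withdrawn, all others get 0.
step1 = pd.Series(
    np.where(df["Replenished"] > 0, df["Replenished"] - df["Withdrawn"], 0),
    index=df.index,
)

# shift(1) looks at the step-1 values, so rows 2 and 3 see a previous Excess of 0.
shifted = step1.shift(1)

# Step 2: rows without a replenishment use the shifted step-1 value minus Withdrawn.
step2 = np.where(df["Replenished"] == 0, shifted - df["Withdrawn"], step1)

print(step1.tolist())    # values after step 1
print(shifted.tolist())  # what each row sees as "previous Excess"
print(step2.tolist())    # final (incorrect) result
```

Because the calculation is recursive (each row depends on the newly computed value of the row before it), a single shift cannot express it.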
Following is the desired output:
+-----------+-------------+--------+
| Withdrawn | Replenished | Excess |
+-----------+-------------+--------+
| 10 | 80 | 70 |
| 20 | 0 | 50 |
| 30 | 0 | 20 |
| 10 | 0 | 10 |
| 20 | 30 | 10 |
+-----------+-------------+--------+
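For reference, the rule I am after can be written as a plain loop. This is only a sketch of the intended logic, not something I want to keep for large data. The function name excess_cash_loop is hypothetical, and the starting balance of 0 for a first row without a replenishment is an assumption.

```python
import pandas as pd

def excess_cash_loop(withdrawn, replenished):
    """Row-by-row version of the rule (hypothetical helper for illustration)."""
    excess = []
    previous = 0  # assumption: no cash on hand before the first row
    for w, r in zip(withdrawn, replenished):
        if r > 0:
            current = r - w          # replenished row: start from the new amount
        else:
            current = previous - w   # otherwise: carry the previous excess forward
        excess.append(current)
        previous = current
    return excess

df = pd.DataFrame({"Withdrawn": [10, 20, 30, 10, 20],
                   "Replenished": [80, 0, 0, 0, 30]})
df["Excess"] = excess_cash_loop(df["Withdrawn"], df["Replenished"])
print(df)
```

On the sample data this gives the Excess column shown in the desired output above.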
Upvotes: 0
Views: 332
Reputation: 402814
This will work for multiple replenishments as well. Take a cumulative sum of "Withdrawn" that restarts at each non-zero "Replenished" row, forward-fill the most recent "Replenished" amount, then subtract one from the other:
cum_deposit = df.groupby(df['Replenished'].ne(0).cumsum())['Withdrawn'].cumsum()
df['Excess'] = df['Replenished'].replace(0, np.nan).ffill() - cum_deposit
df
Withdrawn Replenished Excess
0 10 80 70.0
1 20 0 50.0
2 30 0 20.0
3 10 0 10.0
4 20 30 10.0
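If it helps to see why this works, here is the same approach as a self-contained sketch with the intermediate pieces kept as separate variables. The names group_id, cum_withdrawn and last_replenished are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Withdrawn": [10, 20, 30, 10, 20],
                   "Replenished": [80, 0, 0, 0, 30]})

# Each non-zero Replenished row starts a new group; rows after it share its id.
group_id = df["Replenished"].ne(0).cumsum()

# Running total of withdrawals, restarting at every replenishment.
cum_withdrawn = df.groupby(group_id)["Withdrawn"].cumsum()

# Most recent replenished amount, carried forward over the zero rows.
last_replenished = df["Replenished"].replace(0, np.nan).ffill()

df["Excess"] = last_replenished - cum_withdrawn
print(df)
```

Note that if the first row has a Replenished value of 0, there is nothing to forward-fill from, so Excess stays NaN until the first replenishment.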
Upvotes: 2