Reputation: 6483
I have the following dataframe (df
)
AmountNeeded AmountAvailable
Source Target
1 2 290.0 600.0
4 300.0 600.0
6 200.0 600.0
3 2 290.0 450.0
5 100.0 450.0
7 8 0.0 500.0
I would like to compute the remaining availability per source:
AmountNeeded AmountAvailable RemainingAvailability
Source Target
1 2 290.0 600.0 600
4 300.0 600.0 310
6 200.0 600.0 10
3 2 290.0 450.0 450
5 100.0 450.0 160
7 8 0.0 500.0 500
So if a Source
appears more than once, I need to subtract the sum of lagged values of AmountNeeded
for that particular Source
.
If we take Source 1
and Target 4
the remaining amount should be AmountAvailable-AmountNeeded(previous_row) = 600 - 290 = 310
If we move to Source 1
and Target 6
this will be: 600 - (290+300) = 10
.
This also be computed as RemainingAvailability - AmountNeeded = 310 - 300 = 10
I tried to use different combinations of groupby and diff but without much success.
Upvotes: 0
Views: 565
Reputation: 862851
Use Series.sub
with helper Series created by lambda function with Series.shift
and cumulative sum Series.cumsum
:
s = df.groupby(level=0)['AmountNeeded'].apply(lambda x: x.shift(fill_value=0).cumsum())
df['RemainingAvailability'] = df['AmountAvailable'].sub(s)
print (df)
AmountNeeded AmountAvailable RemainingAvailability
Source Target
1 2 290.0 600.0 600.0
4 300.0 600.0 310.0
6 200.0 600.0 10.0
3 2 290.0 450.0 450.0
5 100.0 450.0 160.0
7 8 0.0 500.0 500.0
Upvotes: 1