Reputation: 73
I want to subtract row value from another row value of a same column using pandas. My dataframe:
holderName policyItemName writtenFee policyNo. writtenPremium
Robert Nelson Policy Fee 25 2017-5124 10
Robert Nelson Policy Fee 25 2017-5124 12
Robert Nelson policy Fee 25 2017-5124 54
Robert Nelson Policy Fee 25 2017-5124 123
Karen Jordan Policy Fee 25 2017-1289 321
Karen Jordan Policy Fee 25 2017-1289 500
Karen Jordan Policy Fee 25 2017-1289 400
I want to subtract 'written premium' from up to down, Like first row's premium remains unchanged, 2nd row's 'written premium' to be subtracted from 3rd row's premium and it will become 2nd row's premium and so on.I want this only for those rows that are having same 'policy number'. the answer can be added to another column.
Output required:
holderName policyItemName writtenFee policyNo. writenPremium derivedPremium
Robert Nelson Policy Fee 25 2017-5124 10 10
Robert Nelson Policy Fee 25 2017-5124 12 12-10=2
Robert Nelson Policy Fee 25 2017-5124 54 54-12=42
Robert Nelson Policy Fee 25 2017-5124 123 123-54=69
Karen Jordan Policy Fee 25 2017-1289 30 30
Karen Jordan Policy Fee 25 2017-1289 50 50-30=20
Karen Jordan Policy Fee 25 2017-1289 40 40-50=-10
Any kind of help is highly appreciated, Thanks.
Upvotes: 3
Views: 3901
Reputation: 862551
Use DataFrameGroupBy.diff
with fillna
for replace first NaN
s:
df['derivedPremium'] = (df.groupby(['policyNo.'])['writtenPremium']
.diff()
.fillna(df['writtenPremium']))
print (df)
olderName policyItemName writtenFee policyNo. writtenPremium \
0 Robert Nelson Policy Fee 25 2017-5124 10
1 Robert Nelson Policy Fee 25 2017-5124 12
2 Robert Nelson policy Fee 25 2017-5124 54
3 Robert Nelson Policy Fee 25 2017-5124 123
4 Karen Jordan Policy Fee 25 2017-1289 30
5 Karen Jordan Policy Fee 25 2017-1289 50
6 Karen Jordan Policy Fee 25 2017-1289 40
derivedPremium
0 10.0
1 2.0
2 42.0
3 69.0
4 30.0
5 20.0
6 -10.0
If working only with integer
s last step is converting:
df['derivedPremium'] = (df.groupby(['policyNo.'])['writtenPremium']
.diff()
.fillna(df['writtenPremium'])
.astype(int))
print (df)
olderName policyItemName writtenFee policyNo. writtenPremium \
0 Robert Nelson Policy Fee 25 2017-5124 10
1 Robert Nelson Policy Fee 25 2017-5124 12
2 Robert Nelson policy Fee 25 2017-5124 54
3 Robert Nelson Policy Fee 25 2017-5124 123
4 Karen Jordan Policy Fee 25 2017-1289 30
5 Karen Jordan Policy Fee 25 2017-1289 50
6 Karen Jordan Policy Fee 25 2017-1289 40
derivedPremium
0 10
1 2
2 42
3 69
4 30
5 20
6 -10
Upvotes: 4