hdrajani
hdrajani

Reputation: 73

Subtract row from another row of same column with pandas

I want to subtract row value from another row value of a same column using pandas. My dataframe:

holderName    policyItemName    writtenFee    policyNo.   writtenPremium

Robert Nelson   Policy Fee         25        2017-5124     10
Robert Nelson   Policy Fee         25        2017-5124     12
Robert Nelson   policy Fee         25        2017-5124     54
Robert Nelson   Policy Fee         25        2017-5124     123
Karen Jordan    Policy Fee         25        2017-1289     321
Karen Jordan    Policy Fee         25        2017-1289     500
Karen Jordan    Policy Fee         25        2017-1289     400

I want to subtract 'written premium' from up to down, Like first row's premium remains unchanged, 2nd row's 'written premium' to be subtracted from 3rd row's premium and it will become 2nd row's premium and so on.I want this only for those rows that are having same 'policy number'. the answer can be added to another column.

Output required:

holderName policyItemName writtenFee policyNo. writenPremium  derivedPremium

Robert Nelson   Policy Fee   25     2017-5124   10             10 
Robert Nelson   Policy Fee   25     2017-5124   12             12-10=2
Robert Nelson   Policy Fee   25     2017-5124   54             54-12=42
Robert Nelson   Policy Fee   25     2017-5124   123            123-54=69
Karen Jordan    Policy Fee   25     2017-1289   30             30
Karen Jordan    Policy Fee   25     2017-1289   50             50-30=20
Karen Jordan    Policy Fee   25     2017-1289   40             40-50=-10

Any kind of help is highly appreciated, Thanks.

Upvotes: 3

Views: 3901

Answers (1)

jezrael
jezrael

Reputation: 862551

Use DataFrameGroupBy.diff with fillna for replace first NaNs:

df['derivedPremium'] = (df.groupby(['policyNo.'])['writtenPremium']
                          .diff()
                          .fillna(df['writtenPremium']))
print (df)

       olderName policyItemName  writtenFee  policyNo.  writtenPremium  \
0  Robert Nelson     Policy Fee          25  2017-5124              10   
1  Robert Nelson     Policy Fee          25  2017-5124              12   
2  Robert Nelson     policy Fee          25  2017-5124              54   
3  Robert Nelson     Policy Fee          25  2017-5124             123   
4   Karen Jordan     Policy Fee          25  2017-1289              30   
5   Karen Jordan     Policy Fee          25  2017-1289              50   
6   Karen Jordan     Policy Fee          25  2017-1289              40   

   derivedPremium  
0            10.0  
1             2.0  
2            42.0  
3            69.0  
4            30.0  
5            20.0  
6           -10.0  

If working only with integers last step is converting:

df['derivedPremium'] = (df.groupby(['policyNo.'])['writtenPremium']
                          .diff()
                          .fillna(df['writtenPremium'])
                          .astype(int))
print (df)

       olderName policyItemName  writtenFee  policyNo.  writtenPremium  \
0  Robert Nelson     Policy Fee          25  2017-5124              10   
1  Robert Nelson     Policy Fee          25  2017-5124              12   
2  Robert Nelson     policy Fee          25  2017-5124              54   
3  Robert Nelson     Policy Fee          25  2017-5124             123   
4   Karen Jordan     Policy Fee          25  2017-1289              30   
5   Karen Jordan     Policy Fee          25  2017-1289              50   
6   Karen Jordan     Policy Fee          25  2017-1289              40   

   derivedPremium  
0              10  
1               2  
2              42  
3              69  
4              30  
5              20  
6             -10  

Upvotes: 4

Related Questions