Summing all amounts by date with respect to individuals

Question

I have this DataFrame:

df:
     payout  person1 person2      date    
1    300.0     LA       NaN     2012-02-01  
2    500.0     DO       NaN     2012-02-01  
3    600.0     DO       NaN     2012-02-01  
4    300.0     DO       NaN     2012-01-01  
5    500.0     DO       NaN     2012-01-01  
6    1000.0    DO       AL      2012-01-01  
7    800.0     DO       AL      2012-01-01

In a separate DataFrame, I need to sum all payouts in each unique month and year for each person1 separately. Then, if person2 exists, I need to split the payout (after each month's summation) between person1 and person2.
The output should look like this:

df:
         person     date         sum 
    1    LA         2012-02-01    300.0 
    2    DO         2012-02-01    1100.0        
    3    DO         2012-01-01    1700.0
    4    AL         2012-01-01    900.0

Ben.T · Accepted Answer

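You can first create a column holding the amount each person should be credited with: half the payout when there is someone in the person2 column, the full payout otherwise. np.where does this (with numpy imported as np and pandas as pd):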

df['payout_sum'] = np.where(df.person2.notnull(), df.payout/2., df.payout)
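
To see what this column holds, here is a minimal sketch that rebuilds the question's sample frame by hand (the construction itself is an assumption, since the question only shows the printed frame) and applies the same np.where call:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt by hand from the question; index 1..7 as printed there.
df = pd.DataFrame(
    {
        "payout": [300.0, 500.0, 600.0, 300.0, 500.0, 1000.0, 800.0],
        "person1": ["LA", "DO", "DO", "DO", "DO", "DO", "DO"],
        "person2": [np.nan, np.nan, np.nan, np.nan, np.nan, "AL", "AL"],
        "date": pd.to_datetime(["2012-02-01"] * 3 + ["2012-01-01"] * 4),
    },
    index=range(1, 8),
)

# Halve the payout only on rows where person2 is present.
df["payout_sum"] = np.where(df["person2"].notnull(), df["payout"] / 2.0, df["payout"])
print(df[["payout", "person2", "payout_sum"]])
```

Only the last two rows have a person2, so only their payouts (1000.0 and 800.0) are halved; the other rows keep the full amount.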

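Then, using concat, groupby and pd.Grouper, you can get the result (the date column must have a datetime dtype for pd.Grouper; convert it with pd.to_datetime if it holds strings):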

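# Stack the person1 rows and the non-null person2 rows under one 'person' column,
# then sum the split payouts per month start and per person
df_tot = (pd.concat([df[['date', 'person1', 'payout_sum']].rename(columns={'person1': 'person'}),
                     df[['date', 'person2', 'payout_sum']].rename(columns={'person2': 'person'})
                                                            .dropna()])
            .groupby([pd.Grouper(key='date', freq='MS'), 'person'])['payout_sum']
            .sum().reset_index())
print(df_tot)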
        date person  payout_sum
0 2012-01-01     AL       900.0
1 2012-01-01     DO      1700.0
2 2012-02-01     DO      1100.0
3 2012-02-01     LA       300.0

The benefit of pd.Grouper with freq='MS' is that it groups the dates by month start, so payouts made on different days of the same month are summed together.
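
As a small illustration of that point, the sketch below uses made-up payouts on different days (the `payments` frame and its values are hypothetical, not taken from the question) and shows each one landing on the first day of its month:

```python
import pandas as pd

# Hypothetical payouts on assorted days; not the question's data.
payments = pd.DataFrame(
    {
        "date": pd.to_datetime(["2012-01-05", "2012-01-20", "2012-02-03", "2012-02-28"]),
        "person": ["DO", "DO", "DO", "LA"],
        "payout_sum": [100.0, 200.0, 50.0, 75.0],
    }
)

# freq="MS" bins every date into the month that contains it,
# labelled by that month's first day.
monthly = (
    payments
    .groupby([pd.Grouper(key="date", freq="MS"), "person"])["payout_sum"]
    .sum()
    .reset_index()
)
print(monthly)
```

The two January payouts for DO collapse into a single 2012-01-01 row, while the February payouts stay separate because they belong to different people.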
