Reputation: 95
I would like somebody help me with the following task I’m trying to resolve. Please see below mock-up of my dataframe.
df=
Date, Investor, Payment, Investor, Payment,
25/02/2018, Investor1, 100,000, Investor2, 200,000
25/04/2020, Investor1, 110,000, Investor2, 220,000
I would to get sum of all payments for a given date, i.e.
SUM all payments for a given date where column name is df[‘Payment’]. Note that column Name Payment repeats multiple times.
Many thanks for your help.
Upvotes: 0
Views: 602
Reputation: 34046
You can do this:
First combine duplicate columns to one column by sum()
:
In [556]: df
Out[556]:
Date Investor Payment Investor Payment
0 25/02/2018 Investor1 100000 Investor2 200000
1 25/04/2020 Investor1 110000 Investor2 220000
In [563]: df = df.groupby(df.columns, axis=1).sum()
In [562]: df
Out[562]:
Date Investor Payment
0 25/02/2018 Investor1 Investor2 300000
1 25/04/2020 Investor1 Investor2 330000
Then, group on Date
and find sum of Payment
:
In [564]: df.groupby('Date')['Payment'].sum()
Out[564]:
Date
25/02/2018 300000
25/04/2020 330000
Name: Payment, dtype: int64
Upvotes: 2