Reputation: 817
Having a complete brainfreeze with what should be a simple pandas routine - but I am not having luck finding the answer.
I have a dataframe with three columns of interest for this - example below.
All I would like to do is add a new column called 'Pct_Day' that takes the percentage of each ID-date amount and divides by the total amount across all IDs for that date.
So for the example below, the result for the first row (8/7/17, ID: 553, Amount: 3) would be 0.25000 (3/(3+4+5))
Thank you for your help - need to get back in the Pandas swing...
df_dict={'DateOf': ['2017-08-07','2017-08-07','2017-08-07','2017-08-04','2017-08-04','2017-08-04'
, '2017-08-03','2017-08-03','2017-08-03'], 'ID': ['553','559','914','553','559','914','553','559','914'], 'Amount': [3, 4, 5, 9, 11, 10, 3, 9, 10]}
df=pd.DataFrame(df_dict)
Upvotes: 1
Views: 621
Reputation: 3399
Here are the steps:
Step 1: Calculate the sum of Amount
for each Date
df['sum'] = df['Amount'].groupby(df['DateOf']).transform('sum')
Step 2: Calculate the percentage using the sum
df['pct_day'] = df['Amount'] / df['sum'] * 100
Output:
Amount DateOf ID sum pct_day
0 3 2017-08-07 553 12 25.000000
1 4 2017-08-07 559 12 33.333333
2 5 2017-08-07 914 12 41.666667
3 9 2017-08-04 553 30 30.000000
4 11 2017-08-04 559 30 36.666667
5 10 2017-08-04 914 30 33.333333
6 3 2017-08-03 553 22 13.636364
7 9 2017-08-03 559 22 40.909091
8 10 2017-08-03 914 22 45.454545
Finally,
Step 3: Drop the sum column
df.drop('sum', axis=1, inplace=True)
You can also do it in one line like this:
df['pct_day'] = df['Amount'] / df['Amount'].groupby(df['DateOf']).transform('sum') * 100
Upvotes: 3
Reputation: 323226
You just need transform
and div
df['New']=df.Amount/df.groupby(['DateOf']).Amount.transform('sum')
df
Out[51]:
Amount DateOf ID New
0 3 2017-08-07 553 0.250000
1 4 2017-08-07 559 0.333333
2 5 2017-08-07 914 0.416667
3 9 2017-08-04 553 0.300000
4 11 2017-08-04 559 0.366667
5 10 2017-08-04 914 0.333333
6 3 2017-08-03 553 0.136364
7 9 2017-08-03 559 0.409091
8 10 2017-08-03 914 0.454545
Upvotes: 3