Reputation: 1530
I have two columns in my pandas_df. Category and Amount. My data looks like this:
category amount
home 20
home 10
fashion 20
fashion 10
celebrity 30
celebrity 40
I want to group the category column and get the sums for each category. I would also need to know the percentage for each category.
Expected output:
home 30 - 23% etc
My code:
dict(df.groupby(['category'])['amount'].sum().sort_values(ascending=False))
Output:
home 30 fashion 30 celebrity 70
Upvotes: 1
Views: 1643
Reputation: 26676
groupby, agg sum and calculate percentage on the resulting sum.
g=df.groupby('category').agg(Sum=('amount','sum')).reset_index()#Calculate sum
g.assign(per=(g.Sum/(g.Sum.sum())*100).astype(int))#Calc the Percentage
category Sum per
0 celebrity 70 53
1 fashion 30 23
2 home 30 23
Upvotes: 1
Reputation: 327
I would first create a "percent" column:
df['percent'] = df['amount'] / sum(df['amount'])
Then, you can group by category and get the desired output, rounded to 2 decimal places:
df.groupby(['category']).sum().round(2)
The output will be:
amount percent
category
celebrity 70 0.54
fashion 30 0.23
home 30 0.23
Depending on your business case, it may be valuable to have the "percent" column for future calculations as the one you are doing. Therefore, including such column as part of your dataset may be reasonable.
Upvotes: 2