Chique_Code
Chique_Code

Reputation: 1530

get the percentage of a grouped values based on another column pands python

I have two columns in my pandas_df. Category and Amount. My data looks like this:

category          amount
home              20
home              10
fashion           20
fashion           10
celebrity         30
celebrity         40

I want to group the category column and get the sums for each category. I would also need to know the percentage for each category.

Expected output: home 30 - 23% etc

My code:

dict(df.groupby(['category'])['amount'].sum().sort_values(ascending=False))

Output: home 30 fashion 30 celebrity 70

Upvotes: 1

Views: 1643

Answers (2)

wwnde
wwnde

Reputation: 26676

groupby, agg sum and calculate percentage on the resulting sum.

 g=df.groupby('category').agg(Sum=('amount','sum')).reset_index()#Calculate sum

g.assign(per=(g.Sum/(g.Sum.sum())*100).astype(int))#Calc the Percentage

    category  Sum  per
0  celebrity   70   53
1    fashion   30   23
2       home   30   23

Upvotes: 1

Arsik36
Arsik36

Reputation: 327

I would first create a "percent" column:

df['percent'] = df['amount'] / sum(df['amount'])

Then, you can group by category and get the desired output, rounded to 2 decimal places:

df.groupby(['category']).sum().round(2)

The output will be:

          amount    percent
category
    
celebrity   70          0.54
fashion     30          0.23
home        30          0.23

Depending on your business case, it may be valuable to have the "percent" column for future calculations as the one you are doing. Therefore, including such column as part of your dataset may be reasonable.

Upvotes: 2

Related Questions