Calculating percentage in Python Pandas library

Question

I have a Pandas dataframe like this:

import pandas as pd
df = pd.DataFrame(
    {'gender':['F','F','F','F','F','M','M','M','M','M'],
     'mature':[0,1,0,0,0,1,1,1,0,1],
     'cta'   :[1,1,0,1,0,0,0,1,0,1]}
)

df['gender'] = df['gender'].astype('category')
df['mature'] = df['mature'].astype('category')

df['cta']    = pd.to_numeric(df['cta'])
df

I calculated the sum (how many times people clicked) and the total (the number of messages sent). How can I calculate the percentage, defined as clicks/total, and get a DataFrame as output?

temp_groupby = df.groupby('gender').agg({'cta': [('clicks','sum'),
                                                 ('total','count')]})
temp_groupby

jezrael · Accepted Answer

Since cta only contains 0 and 1, clicks/total is the same as the mean of cta, so I think you need the average. Add a new tuple to the list:

temp_groupby = df.groupby('gender').agg({'cta': [('clicks','sum'),
                                                 ('total','count'),
                                                 ('perc', 'mean')]})
print (temp_groupby)
          cta           
       clicks total perc
gender                  
F           3     5  0.6
M           2     5  0.4
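
As a quick sanity check on why the mean works here, the sketch below computes clicks/total explicitly and compares it with the mean. It assumes cta holds only 0 and 1 and uses a trimmed copy of the question's data; the names grouped, ratio and mean are just for illustration.

```python
import numpy as np
import pandas as pd

# Trimmed copy of the question's data (only the columns needed here)
df = pd.DataFrame(
    {'gender': ['F', 'F', 'F', 'F', 'F', 'M', 'M', 'M', 'M', 'M'],
     'cta':    [1, 1, 0, 1, 0, 0, 0, 1, 0, 1]}
)

grouped = df.groupby('gender')['cta']

# Explicit definition: clicks divided by the number of sent messages
ratio = grouped.sum() / grouped.count()

# Shortcut: the mean of a 0/1 column gives the same fraction
mean = grouped.mean()

# Compare with a tolerance rather than exact float equality
print(np.allclose(ratio, mean))
```

If cta could hold values other than 0 and 1, or contain missing values, the two definitions may no longer describe the same thing, so it is worth checking the column first.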

To avoid a MultiIndex in the columns, select the column after groupby:

temp_groupby = df.groupby('gender')['cta'].agg([('clicks','sum'),
                                                ('total','count'),
                                                ('perc', 'mean')]).reset_index()
print (temp_groupby)
  gender  clicks  total  perc
0      F       3      5   0.6
1      M       2      5   0.4

Or use named aggregation:

temp_groupby = df.groupby('gender', as_index=False).agg(clicks=('cta', 'sum'),
                                                        total=('cta', 'count'),
                                                        perc=('cta', 'mean'))
print (temp_groupby)
  gender  clicks  total  perc
0      F       3      5   0.6
1      M       2      5   0.4
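
If you want the value on a 0-100 scale instead of a fraction, one possible variant is to divide the two aggregated columns and multiply by 100. This is only a sketch on a trimmed copy of the question's data; the column name perc_pct and the rounding to one decimal are my own choices, not something from the question.

```python
import pandas as pd

# Trimmed copy of the question's data (only the columns needed here)
df = pd.DataFrame(
    {'gender': ['F', 'F', 'F', 'F', 'F', 'M', 'M', 'M', 'M', 'M'],
     'cta':    [1, 1, 0, 1, 0, 0, 0, 1, 0, 1]}
)

# Named aggregation for clicks and total, keeping gender as a regular column
summary = (
    df.groupby('gender', as_index=False)
      .agg(clicks=('cta', 'sum'), total=('cta', 'count'))
)

# clicks/total scaled to 0-100; perc_pct is a hypothetical column name
summary['perc_pct'] = (summary['clicks'] / summary['total'] * 100).round(1)

print(summary)
```

Dividing the aggregated columns keeps the definition clicks/total visible in the code, which may be easier to adapt if the metric changes later.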
