percentage of sum in dataframe pandas

Question

i created the following dataframe by using pandas melt and groupby with value and variable. I used the following:

df2 = pd.melt(df1).groupby(['value','variable'])['variable'].count().unstack('variable').fillna(0)

         Percentile     Percentile1     Percentile2     Percentile3
value                                               
None          0             16              32              48
bottom        0             69              85              88  
top           0             69              88              82  
mediocre     414           260             209             196

I'm looking to create an output that excludes the 'None' row and creates a percentage of the sum of the 'bottom', 'top', and 'mediocre' rows. Desire output would be the following.

         Percentile     Percentile1     Percentile2     Percentile3
value                                               
bottom        0%          17.3%             22.3%              24.0%    
top           0%          17.3%             23.0%              22.4%    
mediocre     414%         65.3%             54.7%              53.6%

one of the main parts of this that i'm struggling with is creating a new row to equal an output. any help would be greatly appreciated!

Marius · Accepted Answer

You can drop the 'None' row like this:

df2 = df2.drop('None')

If you don't want it permanently dropped you don't have to assign that result back to df2.

Then you get your desired output with:

df2.apply(lambda c: c / c.sum() * 100, axis=0)
Out[11]: 
          Percentile1  Percentile2  Percentile3
value                                          
bottom      17.336683    22.251309    24.043716
top         17.336683    23.036649    22.404372
mediocre    65.326633    54.712042    53.551913

To just get straight to that result without permanently dropping the None row:

df2.drop('None').apply(lambda c: c / c.sum() * 100, axis=0)

percentage of sum in dataframe pandas

Answers (1)

Related Questions