Reputation: 4363
i created the following dataframe by using pandas melt and groupby with value and variable. I used the following:
df2 = pd.melt(df1).groupby(['value','variable'])['variable'].count().unstack('variable').fillna(0)
Percentile Percentile1 Percentile2 Percentile3
value
None 0 16 32 48
bottom 0 69 85 88
top 0 69 88 82
mediocre 414 260 209 196
I'm looking to create an output that excludes the 'None' row and creates a percentage of the sum of the 'bottom', 'top', and 'mediocre' rows. Desire output would be the following.
Percentile Percentile1 Percentile2 Percentile3
value
bottom 0% 17.3% 22.3% 24.0%
top 0% 17.3% 23.0% 22.4%
mediocre 414% 65.3% 54.7% 53.6%
one of the main parts of this that i'm struggling with is creating a new row to equal an output. any help would be greatly appreciated!
Upvotes: 3
Views: 4581
Reputation: 60230
You can drop the 'None'
row like this:
df2 = df2.drop('None')
If you don't want it permanently dropped you don't have to assign that result back to
df2
.
Then you get your desired output with:
df2.apply(lambda c: c / c.sum() * 100, axis=0)
Out[11]:
Percentile1 Percentile2 Percentile3
value
bottom 17.336683 22.251309 24.043716
top 17.336683 23.036649 22.404372
mediocre 65.326633 54.712042 53.551913
To just get straight to that result without permanently dropping the None
row:
df2.drop('None').apply(lambda c: c / c.sum() * 100, axis=0)
Upvotes: 10