Reputation: 4820
Assuming that I have a dataframe with the following values:
name start end description
0 ag 20 30 None
1 bgb 21 111 'a'
2 cdd 31 101 None
3 bgb 17 19 None
4 ag 20 22 None
5 ag 1 65 'avc'
I want to groupby name
and then get percent of number of description
which is not null
for every group name
For the example I show, I expect to see:
name percent
0 ag 33.3
1 bgb 50
2 cdd 0
How can I do it ?
Upvotes: 0
Views: 161
Reputation: 862741
Aggregate mean
of boolean mask created by Series.notna
, then multiple by 100
and round if necessary:
df1 = (df['description'].notna()
.groupby(df['name'])
.mean()
.mul(100)
.round(2)
.reset_index(name='percent'))
print (df1)
name percent
0 ag 33.33
1 bgb 50.00
2 cdd 0.00
Alternative with DataFrame.assign
:
df1 = (df.assign(percent=df['description'].notna())
.groupby('name')['percent']
.mean()
.mul(100)
.round(2)
.reset_index())
Upvotes: 3