Pandas: Get percent of numbe of elements which are not null after grouping by other column

Question

Assuming that I have a dataframe with the following values:

    name     start    end     description
0    ag       20       30       None
1    bgb      21       111      'a'
2    cdd      31       101      None
3    bgb      17       19       None
4    ag       20       22       None
5    ag       1        65       'avc'

I want to groupby name and then get percent of number of description which is not null for every group name

For the example I show, I expect to see:

    name     percent 
0    ag       33.3      
1    bgb      50
2    cdd      0

How can I do it ?

jezrael · Accepted Answer

Aggregate mean of boolean mask created by Series.notna, then multiple by 100 and round if necessary:

df1 = (df['description'].notna()
                        .groupby(df['name'])
                        .mean()
                        .mul(100)
                        .round(2)
                        .reset_index(name='percent'))
print (df1)
  name  percent
0   ag    33.33
1  bgb    50.00
2  cdd     0.00

Alternative with DataFrame.assign:

df1 = (df.assign(percent=df['description'].notna())
         .groupby('name')['percent']
         .mean()
         .mul(100)
         .round(2)
         .reset_index())

Pandas: Get percent of numbe of elements which are not null after grouping by other column

Answers (1)

Related Questions