count and countif in pandas dataframe

Question

I have a DF that looks like this:

trainee | course | completed | days overdue
Ava       ABC      Yes         0
Bob       ABC      Yes         1
Charlie   DEF      No          10
David     DEF      Yes         0
Emily     DEF      Yes         0
Finn      GHI      Yes         0

I need to create a DF that tells me how many courses were taken, how many were taken in time (i.e. with 0 overdue days), and what's the percentage of completeness.

I.e. the result should look like this:

course | count | in time | % completed in time
ABC      2       1         0.5
DEF      3       2         0.66
GHI      1       1         1

How can I do it with Pandas?

Thanks!

FP

p.s. here's the code to generate the input DF:

df = pd.DataFrame({'Trainee': ['Ava','Bob','Charlie','David','Emily','Finn'],'Course':['ABC','ABC','DEF','DEF','DEF','GHI'],'Completed': ['Yes','Yes','No','Yes','Yes','Yes'],'Days overdue':[0,1,10,0,0,0]})

jezrael · Accepted Answer

Use agg fot total and count 0s per groups, last divide columns by div:

tups = [('count', 'size'), ('in time', lambda x: (x==0).sum())]
df = df.groupby('course')['days overdue'].agg(tups).reset_index()

df['% completed in time'] = df['in time'].div(df['count'])
print (df)
  course  count  in time  % completed in time
0    ABC      2        1             0.500000
1    DEF      3        2             0.666667
2    GHI      1        1             1.000000

count and countif in pandas dataframe

Answers (2)

Related Questions