Effe Pelosa
Effe Pelosa

Reputation: 173

count and countif in pandas dataframe

I have a DF that looks like this:

trainee | course | completed | days overdue
Ava       ABC      Yes         0
Bob       ABC      Yes         1
Charlie   DEF      No          10
David     DEF      Yes         0
Emily     DEF      Yes         0
Finn      GHI      Yes         0

I need to create a DF that tells me how many courses were taken, how many were taken in time (i.e. with 0 overdue days), and what's the percentage of completeness.

I.e. the result should look like this:

course | count | in time | % completed in time
ABC      2       1         0.5
DEF      3       2         0.66
GHI      1       1         1

How can I do it with Pandas?

Thanks!

FP

p.s. here's the code to generate the input DF:

df = pd.DataFrame({'Trainee': ['Ava','Bob','Charlie','David','Emily','Finn'],'Course':['ABC','ABC','DEF','DEF','DEF','GHI'],'Completed': ['Yes','Yes','No','Yes','Yes','Yes'],'Days overdue':[0,1,10,0,0,0]})

Upvotes: 3

Views: 5221

Answers (2)

sammywemmy
sammywemmy

Reputation: 28644

one option, which offers a modest speed up, is to build an array of 1s and 0s for the days overdue, before grouping:

temp = df.assign(d = np.where(df['Days overdue'] == 0, 1, 0)).groupby('Course')

count = temp.Trainee.size()

in_time = temp.d.sum()

completed_in_time = in_time/count

pd.DataFrame({'count':count, 
              'in time' : in_time, 
              '% completed in time' : completed_in_time
              })

        count  in time  % completed in time
Course
ABC         2        1             0.500000
DEF         3        2             0.666667
GHI         1        1             1.000000

Upvotes: 0

jezrael
jezrael

Reputation: 862511

Use agg fot total and count 0s per groups, last divide columns by div:

tups = [('count', 'size'), ('in time', lambda x: (x==0).sum())]
df = df.groupby('course')['days overdue'].agg(tups).reset_index()

df['% completed in time'] = df['in time'].div(df['count'])
print (df)
  course  count  in time  % completed in time
0    ABC      2        1             0.500000
1    DEF      3        2             0.666667
2    GHI      1        1             1.000000

Upvotes: 5

Related Questions