I have the following dataframe:
df =
GROUP  TOTAL_SERVICE_TIME  TOTAL_WAIT_TIME  IS_EVALUATED  IS_NEGATIVE_GRADE
AAA                    19               60             0                  0
AAA                   248               84             1                  0
AAA                   135               62             1                  1
BBB                    97               36             1                  1
BBB                   395              117             0                  0
I am grouping the data by GROUP and TOTAL_WAIT_TIME as follows:
funcs = {
'TOTAL_SERVICE_TIME': {'TOTAL_SERVICE_TIME':'mean'},
'IS_EVALUATED' : {'IS_EVALUATED':'size'},
'IS_NEGATIVE_GRADE' : {'IS_NEGATIVE_GRADE':'size'},
}
fresult = result.groupby(['GROUP','TOTAL_WAIT_TIME']).agg(funcs)
fresult.columns = fresult.columns.droplevel(0)
fresult = fresult.reset_index()
fresult
The problem is that IS_EVALUATED and IS_NEGATIVE_GRADE are calculated incorrectly. I want to count only the rows where these columns equal 1, not all of the rows in each group.
In [202]: funcs = {
...: 'TOTAL_SERVICE_TIME': 'mean',
...: 'IS_EVALUATED' : 'sum',
...: 'IS_NEGATIVE_GRADE' : 'sum',
...: }
...:
...: fresult = result.groupby(['GROUP','TOTAL_WAIT_TIME'], as_index=False).agg(funcs)
...:
In [203]: fresult
Out[203]:
   GROUP  TOTAL_WAIT_TIME  IS_EVALUATED  IS_NEGATIVE_GRADE  TOTAL_SERVICE_TIME
0    AAA               60             0                  0                  19
1    AAA               62             1                  1                 135
2    AAA               84             1                  0                 248
3    BBB               36             1                  1                  97
4    BBB              117             0                  0                 395
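Why this works: 'size' counts every row in a group regardless of its value, while 'sum' over a column that contains only 0 and 1 adds up to the number of 1s. Below is a minimal, self-contained sketch of the same idea. It assumes the frame from the question is named df (the snippets above call it result), and the variable names n_rows and ones_only are made up for illustration. The plain {'column': 'function'} mapping is used because, as far as I know, the nested-dict renaming form from the question was deprecated in pandas 0.20 and raises an error in pandas 1.0 and later.

```python
import pandas as pd

# Sample data copied from the question; the name `df` is an assumption.
df = pd.DataFrame({
    "GROUP": ["AAA", "AAA", "AAA", "BBB", "BBB"],
    "TOTAL_SERVICE_TIME": [19, 248, 135, 97, 395],
    "TOTAL_WAIT_TIME": [60, 84, 62, 36, 117],
    "IS_EVALUATED": [0, 1, 1, 1, 0],
    "IS_NEGATIVE_GRADE": [0, 0, 1, 1, 0],
})

keys = ["GROUP", "TOTAL_WAIT_TIME"]

# 'sum' on a 0/1 flag column counts the rows where the flag is 1.
fresult = df.groupby(keys, as_index=False).agg({
    "TOTAL_SERVICE_TIME": "mean",
    "IS_EVALUATED": "sum",
    "IS_NEGATIVE_GRADE": "sum",
})

# For comparison: size() counts all rows per group, whatever the flag values are.
n_rows = df.groupby(keys).size()

# If a column could hold values other than 0 and 1, compare first, then sum
# the resulting booleans so that only exact 1s are counted.
ones_only = (
    df["IS_EVALUATED"].eq(1)
    .groupby([df["GROUP"], df["TOTAL_WAIT_TIME"]])
    .sum()
)

print(fresult)
```

With the sample data every (GROUP, TOTAL_WAIT_TIME) pair is unique, so each group holds a single row and the sums simply echo the flags. The difference from 'size' becomes visible once several rows share the same keys.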