Reputation: 2751
I would like to take the df below, group unique values together by 'USER','TASK', and 'STATIC_VALUE'. I can do this using groupby(), but I'm having trouble adding a 'TASK_COUNT' and a 'TOTALS' column. The 'TOTALS' column would be multiplying 'STATIC_VALUE' * 'TASK_COUNT'. I've tried a number of variations of groupby(), transform(), size() and I can't get there. Suggestions? Thank you!
Dataframe:
USER TASK STATIC_VALUE
1 USER1 TASK2 30
2 USER2 TASK7 12
3 USER5 TASK4 9
4 USER12 TASK2 30
5 USER2 TASK3 10
6 USER1 TASK2 30
7 USER5 TASK7 12
8 USER1 TASK3 10
9 USER2 TASK3 10
This piece gets me close:
>>> df.groupby(['USER','TASK','STATIC_VALUE']).size()
USER TASK STATIC_VALUE
USER1 TASK2 30 2
TASK3 10 1
USER2 TASK7 12 1
TASK3 10 2
USER5 TASK4 9 1
TASK7 12 1
USER12 TASK2 30 1
Expected Result:
USER TASK STATIC_VALUE TASK_COUNT TOTAL
USER1 TASK2 30 2 60
TASK3 10 1 10
USER2 TASK7 12 1 12
TASK3 10 2 20
USER5 TASK4 9 1 9
TASK7 12 1 12
USER12 TASK2 30 1 30
Upvotes: 1
Views: 32
Reputation: 863451
Use GroupBy.size
:
df1 = df.groupby(['USER','TASK', 'STATIC_VALUE']).size().reset_index(name='TASK_COUNT')
df1['TOTAL'] = df1['TASK_COUNT'] * df1['STATIC_VALUE']
print (df1)
USER TASK STATIC_VALUE TASK_COUNT TOTAL
0 USER1 TASK2 30 2 60
1 USER1 TASK3 10 1 10
2 USER12 TASK2 30 1 30
3 USER2 TASK3 10 2 20
4 USER2 TASK7 12 1 12
5 USER5 TASK4 9 1 9
6 USER5 TASK7 12 1 12
Upvotes: 2