Reputation: 6562
I have a pandas dataframe that contains a user id and the ad click count (if any) for that user across several days:
import pandas as pd
df = pd.DataFrame([['A', 0], ['A', 1], ['A', 0], ['B', 0], ['B', 0], ['B', 0], ['B', 1], ['B', 1], ['B', 1]], columns=['user_id', 'click_count'])
Out[8]:
  user_id  click_count
0       A            0
1       A            1
2       A            0
3       B            0
4       B            0
5       B            0
6       B            1
7       B            1
8       B            1
I would like to convert this into a dataframe with one row per user, where 'click_cnt' is the sum of click_count across all of that user's rows in the original dataframe, i.e.
Out[18]:
  user_id  click_cnt
0       A          1
1       B          3
Upvotes: 0
Views: 36
Reputation: 2349
What you're after is the groupby method:
df = df.groupby('user_id', as_index=False).sum()
Passing as_index=False keeps the group keys (here user_id) as a regular column instead of using them as the new index.
groupby is super useful; have a read through the documentation for more info.
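Note that the summed column keeps its original name, click_count, so getting the click_cnt column from the question needs a rename. Here is a minimal sketch of the full round trip on the question's sample data; the variable names summed, result and indexed are only illustrative, and selecting ['click_count'] explicitly is my own addition to keep the sum limited to that column:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    [['A', 0], ['A', 1], ['A', 0],
     ['B', 0], ['B', 0], ['B', 0], ['B', 1], ['B', 1], ['B', 1]],
    columns=['user_id', 'click_count'],
)

# as_index=False keeps user_id as a regular column in the result
summed = df.groupby('user_id', as_index=False)['click_count'].sum()

# Rename the summed column to the name asked for in the question
result = summed.rename(columns={'click_count': 'click_cnt'})
print(result)

# For comparison: without as_index=False, user_id becomes the index
indexed = df.groupby('user_id')['click_count'].sum()
print(indexed)
```

If you prefer the indexed form, calling reset_index() on it should also turn user_id back into a column.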
Upvotes: 1