Anuj Gupta

Reputation: 6562

Pandas dataframe: merge rows into 1 row and sum a column

I have a pandas dataframe that contains a user id and the ad clicks (if any) by that user across several days:

 import pandas as pd
 df = pd.DataFrame([['A',0], ['A',1], ['A',0], ['B',0], ['B',0], ['B',0], ['B',1], ['B',1], ['B',1]], columns=['user_id', 'click_count'])

Out[8]:

  user_id  click_count
0       A            0
1       A            1
2       A            0
3       B            0
4       B            0
5       B            0
6       B            1
7       B            1
8       B            1

I would like to convert this into a dataframe with one row per user, where 'click_cnt' is the sum of click_count across all of that user's rows in the original dataframe, i.e.

Out[18]:

  user_id  click_cnt
0       A          1
1       B          3

Upvotes: 0

Views: 36

Answers (1)

PeptideWitch

Reputation: 2349

What you're after is the function groupby:

df = df.groupby('user_id', as_index=False).sum()

Passing as_index=False keeps the group keys (here user_id) as a regular column in the result instead of turning them into its index.
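To make this concrete, here is a minimal sketch using the dataframe from the question. Note that a plain .sum() keeps the original column name click_count; to get the click_cnt name the question asks for, one option is named aggregation via .agg(...), which I'm assuming is available (it needs pandas 0.25 or newer). The variable names default_index, summed and result are just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 0], ['A', 1], ['A', 0],
     ['B', 0], ['B', 0], ['B', 0], ['B', 1], ['B', 1], ['B', 1]],
    columns=['user_id', 'click_count'],
)

# Default behaviour: the group keys become the index of the result.
default_index = df.groupby('user_id').sum()

# as_index=False: user_id stays a regular column, one row per user.
summed = df.groupby('user_id', as_index=False).sum()

# Named aggregation (assumes pandas >= 0.25): sum click_count per user
# and call the output column click_cnt, as in the desired output.
result = df.groupby('user_id', as_index=False).agg(
    click_cnt=('click_count', 'sum')
)

print(result)
```

If named aggregation isn't available in your pandas version, summed.rename(columns={'click_count': 'click_cnt'}) should give the same column name.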

groupby is super useful - have a read through the documentation for more info.

Upvotes: 1
