Reputation: 739
I'm trying to group a DataFrame by the column 'Over_Id' and also sum the values of the column runs_scored while grouping.
If I use groupby, I lose my other columns.
E.g.:
ball.groupby(['Match_Id','Innings_Id','Over_Id'])['runs_scored'].sum()
I was able to get my summed runs_scored column, but in a new DataFrame, not in my original one, as seen in the image. I can't merge, because the sum of the runs_scored column is based on 3 columns.
In short, I want only one entry for each Over_Id and its corresponding runs_scored.
How can I do that?
Upvotes: 1
Views: 1939
Reputation: 402263
You could just group by every column besides the runs_scored column, and then find the sum.
c = df.columns.difference(['runs_scored']).tolist()
df = df.groupby(c, as_index=False).runs_scored.sum()
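As a rough illustration of the two lines above, here is a minimal sketch on made-up data. The sample values and the Team_Batting column are assumptions, not taken from the question. It also assumes every column other than runs_scored is constant within an over, otherwise the rows would not collapse to one per over.

```python
import pandas as pd

# Hypothetical sample data; Team_Batting is an invented extra column
# that holds the same value for every ball of an over.
ball = pd.DataFrame({
    "Match_Id":     [1, 1, 1, 1, 1],
    "Innings_Id":   [1, 1, 1, 1, 1],
    "Over_Id":      [1, 1, 1, 2, 2],
    "Team_Batting": ["A", "A", "A", "A", "A"],
    "runs_scored":  [0, 4, 1, 6, 2],
})

# Every column except runs_scored becomes a grouping key.
# Index.difference returns the remaining names in sorted order,
# so the column order of the result may differ from the input.
keys = ball.columns.difference(["runs_scored"]).tolist()

# as_index=False keeps the keys as ordinary columns instead of an index.
per_over = ball.groupby(keys, as_index=False)["runs_scored"].sum()
print(per_over)
```

With this sample, per_over has one row per over and keeps the other columns alongside the summed runs_scored.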
On a side note, it seems you have a lot of redundant data entries. Have you looked at normalising your tables?
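If some columns do vary from ball to ball, grouping by all of them would not reduce the rows. In that case, one possible alternative is to compute the totals separately and merge them back, since merge accepts a list of key columns. The sketch below uses invented data; the Ball_Id column and the over_runs name are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; Ball_Id is an invented column that differs per row.
ball = pd.DataFrame({
    "Match_Id":    [1, 1, 1, 1, 1],
    "Innings_Id":  [1, 1, 1, 1, 1],
    "Over_Id":     [1, 1, 1, 2, 2],
    "Ball_Id":     [1, 2, 3, 1, 2],
    "runs_scored": [0, 4, 1, 6, 2],
})

keys = ["Match_Id", "Innings_Id", "Over_Id"]

# One row per over, with the total renamed to avoid clashing with runs_scored.
over_totals = (
    ball.groupby(keys, as_index=False)["runs_scored"]
        .sum()
        .rename(columns={"runs_scored": "over_runs"})
)

# Merging on a list of columns matches rows on all three keys at once.
merged = ball.merge(over_totals, on=keys, how="left")
print(merged)
```

Here every original row is kept and gains an over_runs column holding the total for its over.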
Upvotes: 3