Reputation: 2076
I am trying to learn how to work with pandas DataFrames. My DataFrame has four columns: A, B, C and D.
For each index (A, B, C) there are multiple values of D. I want to merge these rows and sum the values of D.
I have:
╔═══╦═══╦═══╦═══╦═══╗
║ ║ A ║ B ║ C ║ D ║
╠═══╬═══╬═══╬═══╬═══╣
║ 1 ║ 1 ║ 2 ║ 3 ║ 5 ║
║ 1 ║ 1 ║ 2 ║ 3 ║ 3 ║
║ 2 ║ 1 ║ 5 ║ 4 ║ 2 ║
║ 2 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 3 ║
╚═══╩═══╩═══╩═══╩═══╝
I want to get:
╔═══╦═══╦═══╦═══╦═══╗
║ ║ A ║ B ║ C ║ D ║
╠═══╬═══╬═══╬═══╬═══╣
║ 1 ║ 1 ║ 2 ║ 3 ║ 8 ║
║ 2 ║ 1 ║ 5 ║ 4 ║ 2 ║
║ 2 ║ 1 ║ 2 ║ 4 ║ 2 ║
║ 3 ║ 1 ║ 2 ║ 4 ║ 5 ║
╚═══╩═══╩═══╩═══╩═══╝
I tried to do it this way:
df = df.groupby(['A', 'B', 'C'])['D'].sum()
But this gives me a Series instead of a DataFrame.
Upvotes: 1
Views: 1042
Reputation: 393863
If you want to retain the columns after groupby, you can call reset_index:
In [185]:
df.groupby(['A','B','C'])['D'].sum().reset_index()
Out[185]:
   A  B  C  D
0  1  2  3  8
1  1  2  4  7
2  1  5  4  2
Alternatively, pass as_index=False to groupby so that the grouping keys stay as regular columns.
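As a minimal sketch of the as_index=False variant next to the reset_index one: the sample frame below is rebuilt from the A-D columns of the table in the question (the unnamed first column is left out, which is an assumption on my part), and the names result and via_reset are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question's table (columns A-D only;
# the unnamed leading column is omitted, which is an assumption).
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 1],
    "B": [2, 2, 5, 2, 2, 2],
    "C": [3, 3, 4, 4, 4, 4],
    "D": [5, 3, 2, 2, 2, 3],
})

# as_index=False keeps A, B and C as ordinary columns, so the result
# is a DataFrame rather than a Series with a MultiIndex.
result = df.groupby(["A", "B", "C"], as_index=False)["D"].sum()

# Same outcome via reset_index on the grouped Series.
via_reset = df.groupby(["A", "B", "C"])["D"].sum().reset_index()

print(result)
```

Both forms should give the same three rows as the output shown above; which one to use is mostly a matter of taste.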
Upvotes: 1