Reputation: 11
I have a dataset with 1000 rows and 20 columns, and I would like to group some rows by value.
Here is an example of what I have:
column A column B column C column D column E
XX YY 25/01/2022 25 50
XX YY 17/01/2023 32 10
I would like to group by column A and column B and sum the values of column D and column E.
Here is an example of what I want:
column A column B column C column D column E
XX YY 25/01/2022 57 60
I kept the earliest date. I would like to do this in Python, perhaps with pandas.
Thanks for reading.
If I need to group only the rows where column A is XX, what should I add?
Upvotes: 1
Views: 43
Reputation: 3260
Use df.groupby() with an aggregation: first for column C, and sum for columns D and E.
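import pandas as pd

data = {'A': ['XX', 'XX'], 'B': ['YY', 'YY'],
        'C': ['25/01/2022', '17/01/2023'],
        'D': [25, 32], 'E': [50, 10]}
df = pd.DataFrame(data)
# Keep only the rows where A is 'XX', then group by A and B:
# take the first value of C and sum D and E within each group.
df[df.A == 'XX'].groupby(['A', 'B'], as_index=False).agg({'C':'first','D':'sum','E':'sum'})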
Output:
A B C D E
0 XX YY 25/01/2022 57 60
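Note that first keeps whichever date appears first in each group, which matches the earliest date only when the rows happen to be in date order. A minimal sketch of one way to take the earliest date regardless of row order, and to leave the rows where A is not XX untouched, could look like the following. The extra ZZ row and the names mask, grouped and result are illustrative assumptions, not part of the question's data.

```python
import pandas as pd

# Sample data: the two rows from the question (deliberately out of date order)
# plus one hypothetical ZZ row to show that non-XX rows are left alone.
df = pd.DataFrame({
    "A": ["XX", "XX", "ZZ"],
    "B": ["YY", "YY", "YY"],
    "C": ["17/01/2023", "25/01/2022", "01/03/2022"],
    "D": [32, 25, 7],
    "E": [10, 50, 3],
})

# Parse the day/month/year strings so that "min" compares dates, not text.
df["C"] = pd.to_datetime(df["C"], format="%d/%m/%Y")

# Boolean mask selecting the rows to aggregate.
mask = df["A"] == "XX"

# Aggregate only the XX rows: earliest date in C, sums for D and E.
grouped = (
    df[mask]
    .groupby(["A", "B"], as_index=False)
    .agg({"C": "min", "D": "sum", "E": "sum"})
)

# Put the untouched rows back next to the aggregated ones.
result = pd.concat([grouped, df[~mask]], ignore_index=True)

# Optional: format the dates back to the original text layout.
result["C"] = result["C"].dt.strftime("%d/%m/%Y")
print(result)
```

With this sample data, the two XX/YY rows collapse into a single row with date 25/01/2022, D = 57 and E = 60, while the ZZ row passes through unchanged. If the other rows are not needed, the pd.concat step can simply be dropped.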
Upvotes: 1