Reputation: 2062
Let's say I have a simple dataframe with a datetime index and three columns: a value, an aggregating indicator, and a unique identifier. The index is not unique across the dataframe, because the same dates repeat across rows and across values of the indicator. It looks like this:
index, val, aggregating indicator, unique_ref
1-Jan, 1, set_a, r1
2-Jan, 2, set_a, r2
1-Jan, 3, set_a, r3
2-Jan, 4, set_a, r4
1-Jan, 5, set_b, r5
2-Jan, 6, set_b, r6
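For anyone who wants to reproduce this, the frame above could be built roughly as follows. This is only a sketch: the dates are kept as plain strings for brevity (the real index is a datetime), and the variable name df is an assumption.

```python
import pandas as pd

# Sample rows from the table above. Dates are plain strings here for brevity
# (an assumption; the real index is described as a datetime).
df = pd.DataFrame(
    {
        "val": [1, 2, 3, 4, 5, 6],
        "aggregating indicator": ["set_a", "set_a", "set_a", "set_a", "set_b", "set_b"],
        "unique_ref": ["r1", "r2", "r3", "r4", "r5", "r6"],
    },
    index=["1-Jan", "2-Jan", "1-Jan", "2-Jan", "1-Jan", "2-Jan"],
)

# The same dates repeat, so the index is not unique.
print(df.index.is_unique)
```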
Is it possible to aggregate val in one step, grouping by both the index and the aggregating indicator?
i.e. the result I would like is:
index, sum, aggregating_indicator
1-Jan, 4, set_a // i.e. 1+3
2-Jan, 6, set_a // i.e. 2+4
1-Jan, 5, set_b
2-Jan, 6, set_b
Doing it in two steps is OK, but I was wondering whether it can be done neatly in one.
Upvotes: 1
Views: 70
Reputation: 117485
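You can do it in one chained expression. reset_index() turns the (unnamed) index into an ordinary column called index, so it can be grouped together with the indicator column: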
>>> df.reset_index().groupby(['index','aggregating indicator'])['val'].sum().reset_index()
   index aggregating indicator  val
0  1-Jan                 set_a    4
1  1-Jan                 set_b    5
2  2-Jan                 set_a    6
3  2-Jan                 set_b    6
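As a variation, groupby in reasonably recent pandas versions also accepts index level names alongside column names, so the first reset_index() can be skipped if the index has a name. A minimal sketch, assuming the index is named "index" and using string dates for brevity (both are assumptions, not something stated in the question):

```python
import pandas as pd

# Rebuild the sample frame; the index name "index" is an assumption made
# so that groupby can refer to the index level by name.
df = pd.DataFrame(
    {
        "val": [1, 2, 3, 4, 5, 6],
        "aggregating indicator": ["set_a", "set_a", "set_a", "set_a", "set_b", "set_b"],
        "unique_ref": ["r1", "r2", "r3", "r4", "r5", "r6"],
    },
    index=pd.Index(["1-Jan", "2-Jan", "1-Jan", "2-Jan", "1-Jan", "2-Jan"], name="index"),
)

# Group by the named index level and the indicator column in one call,
# sum val within each group, then flatten the result back into columns.
result = (
    df.groupby(["index", "aggregating indicator"])["val"]
    .sum()
    .reset_index()
)
print(result)
```

If the index has no name, the reset_index() form above is the simpler choice, since there is then no level name to pass to groupby.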
Upvotes: 2