Reputation: 469
I'm trying to understand how to sum a subset of rows based on two indices in pandas. The first index groups the rows and the second index determines which rows to sum. For a minimal example, suppose I have the dataframe:
index1 | index2 | value
------------------------
a | 1 | 10
a | 2 | 11
a | 3 | 12
b | 1 | 20
b | 2 | 21
b | 3 | 22
How can I sum the rows corresponding to index2 values 1 and 2 while still keeping them grouped under index1 values a and b? I.e., the following result:
index1 | index2 | value
------------------------
a | 1 | 21
a | 2 | 12
b | 1 | 41
b | 2 | 22
Here the rows corresponding to index2 of 1 and 2 have been summed into the new index2 of 1.
Upvotes: 1
Views: 2835
Reputation: 215137
You can derive a new array/Series from the index2 column to use as a grouping variable, and then do the summation:
df.groupby([df.index1, ~df.index2.isin([1,2]) + 1]).value.sum().reset_index()
#index1 index2 value
#0 a 1 21
#1 a 2 12
#2 b 1 41
#3 b 2 22
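For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The frame construction is a reconstruction of the table in the question, and the names `new_index2` and `result` are only illustrative. The trick is that `isin([1, 2])` yields a boolean mask, negating it gives False for the rows to merge and True for the rest, and adding 1 turns that into the group labels 1 and 2.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed layout: plain columns).
df = pd.DataFrame({
    "index1": ["a", "a", "a", "b", "b", "b"],
    "index2": [1, 2, 3, 1, 2, 3],
    "value": [10, 11, 12, 20, 21, 22],
})

# False where index2 is 1 or 2, True otherwise; casting to int and adding 1
# maps the merged rows to label 1 and the remaining rows to label 2.
new_index2 = (~df["index2"].isin([1, 2])).astype(int) + 1

# Group by the original index1 column and the derived labels, then sum.
result = (
    df.groupby(["index1", new_index2])["value"]
      .sum()
      .reset_index()
)
print(result)
```

The derived Series keeps the name index2, so the output column is labelled the same way as in the question. If the real data needs more than two buckets, the boolean trick no longer fits and an explicit mapping of old labels to new ones would probably be the clearer choice.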
Upvotes: 3