Summing rows based on conditional in Pandas

Question

I'm trying to understand how to sum a subset of rows based on two indices in Pandas. The first index groups the rows, and the second index determines which rows to sum. As a minimal example, suppose I have the dataframe:

index1 | index2 | value
------------------------
  a    |    1   |  10
  a    |    2   |  11
  a    |    3   |  12
  b    |    1   |  20
  b    |    2   |  21
  b    |    3   |  22

How can I sum the rows where index2 is 1 or 2 while still keeping them grouped under index1 values a and b? That is, I want the following result:

index1 | index2 | value
------------------------
  a    |    1   |  21
  a    |    2   |  12
  b    |    1   |  41
  b    |    2   |  22

Here the rows corresponding to index2 of 1 and 2 have been summed into the new index2 of 1.

akuiper · Accepted Answer

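You can derive a new Series from the index2 column to use as the grouping key, and then sum within each group. The expression ~df.index2.isin([1,2]) is False where index2 is 1 or 2 and True otherwise, so adding 1 maps those rows to group 1 and every other row to group 2: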

df.groupby([df.index1, ~df.index2.isin([1,2]) + 1]).value.sum().reset_index()

#index1 index2  value
#0    a      1     21
#1    a      2     12
#2    b      1     41
#3    b      2     22
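
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample dataframe from the question and names the intermediate key group_key (a name chosen here for illustration, not part of the original answer). The explicit astype and rename calls are an assumption about what reads more clearly, not something the one-liner requires.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "index1": ["a", "a", "a", "b", "b", "b"],
    "index2": [1, 2, 3, 1, 2, 3],
    "value": [10, 11, 12, 20, 21, 22],
})

# group_key (illustrative name): 1 where index2 is 1 or 2, otherwise 2.
# ~isin(...) is False for the rows to merge; casting to int and adding 1
# turns False/True into 1/2.
group_key = (~df["index2"].isin([1, 2])).astype("int64") + 1
group_key = group_key.rename("index2")  # keep the output column name

# Group by index1 and the derived key, then sum the value column
result = (
    df.groupby([df["index1"], group_key])["value"]
      .sum()
      .reset_index()
)

print(result)
```

Note that with this key every index2 value other than 1 or 2 lands in group 2, which matches the example because 3 is the only other value. If several distinct groups are needed, an explicit mapping of old values to new ones would probably be a better fit.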
