Difference of one multi index level

Question

For a MultiIndex with a repeating level, how can I calculate the differences with another level of the index, effectively ignoring it?

Let me explain in code.

>>> ix = pd.MultiIndex.from_product([(0, 1, 2), (0, 1, 2, 3)])
>>> df = pd.DataFrame([5]*4 + [4]*4 + [3, 2, 1, 0], index=ix)
>>> df
     0
0 0  5
  1  5
  2  5
  3  5
1 0  4
  1  4
  2  4
  3  4
2 0  3
  1  2
  2  1
  3  0

Now by some operation I'd like to subtract the last set of values (2, 0:4) from the whole data frame. I.e. df - df.loc[2] to produce this:

But the statement produces an error. df - df.loc[2:3] does not, but in addition to the trailing zeros only NaNs are produced - naturally of course because the indices don't match.

How could this be achieved?

I realised that the index level is precisely the problem. So I got a bit closer.

>>> df.droplevel(0) - df.loc[2]
   0
0  2
0  1
0  0
1  3
1  2
1  0
2  4
2  3
2  0
3  5
3  4
3  0

Still not quite what I want. But I don't know if there's a convenient way of achieving what I'm after.

Quang Hoang · Accepted Answer

This with stack and unstack:

new_df = df.unstack()
new_df.sub(new_df.loc[2]).stack()

Output:

Difference of one multi index level

Answers (2)

Related Questions