Nicoleue
Nicoleue

Reputation: 65

Pandas df.reindex() doesn't work for level>0 for MultiIndex

I've got two DataFrames A and B, both with a MultiIndex in the columns. B however has the order of the columns switched up. Now I would simply like to get the columns of B in the same order as in A.

The clue is, the order depends on the second level of the MultiIndex. Pandas' reindex() should do just that, but it does only work for me at level 0 as shown in the following

df = pd.DataFrame([[1,2,3],[4,5,6]])
df.columns = pd.MultiIndex.from_arrays([["a","b","c"], ["aa","bb","cc"]])
print(df)

outputs

    a   b   c
    aa  bb  cc
0   1   2   3
1   4   5   6

First level of the MultiIndex:

df = df.reindex(columns=["b","a","c"], level=0)
print(df)

    b   a   c
    bb  aa  cc
0   2   1   3
1   5   4   6

Second level:

df = df.reindex(columns=["bb","aa","cc"], level=1)
df
    a   b   c
    aa  bb  cc
0   1   2   3
1   4   5   6

The function doesn't raise an exception. And I've also tried it with columns=["bb","aa","cc"], level=2 or columns=["bb","aa","cc"], level=0. There it outputs an empty DataFrame, so my implementation seems to be correct, but just nothing happens...

The next best workaround I can think of is changing level 1 to level 0, but that shouldn't be necessary. Looking forward to your ideas!

Upvotes: 1

Views: 136

Answers (2)

nisakova
nisakova

Reputation: 622

assign back to df

df = df.reindex(columns=["b","a","c"], level=0)
df = df.reindex(columns=["bb","aa","cc"], level=1)
print(df)

   b  a  c
  bb aa cc
0  2  1  3
1  5  4  6

Upvotes: 0

mozway
mozway

Reputation: 262484

If you want to reindex both levels, just pass your MultiIndex:

A = pd.DataFrame([[1,2,3],[4,5,6]])
A.columns = pd.MultiIndex.from_arrays([["a","b","c"], ["aa","bb","cc"]])

B = pd.DataFrame(columns=pd.MultiIndex.from_arrays([["b","a","c"],
                                                    ["bb","aa","cc"]]),
                 index=[0])

out = A.reindex(columns=B.columns)

Or maybe you just want to sort_index:

order = ["bb","aa","cc"]
out = A.sort_index(level=1, axis=1,
                   key=pd.Series({v:k for k,v in enumerate(order)}).get)

Output:

   b  a  c
  bb aa cc
0  2  1  3
1  5  4  6

Upvotes: 4

Related Questions