Reputation: 1331
I am trying to find a straightforward way to change the order of values in a pandas DatafFrame multiindex. To illustrate what I mean, suppose we have a DataFrame with multiindex define as follows:
index = pd.MultiIndex(levels=[[u'C', u'D', u'M'], [u'C', u'D', u'M']],
labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]],
names=[u'level0', u'level1'])
df = pd.DataFrame(np.random.randint(10,size=(9,3)),index=index,columns=['C','M','D'])
So we have a DataFrame df as follows:
What I am trying to do is to change the sequence of the Multiindex from "C D M" (which is ordered alphabetically) to "C M D" in both level0 and level1. I have tried to use pd.reindex, but have not found an easy way to achieve this goal.
Jezrael gave an answer below which gives the correct display:
L = list('CMD')
mux = pd.MultiIndex.from_product([L, L], names=df.index.names)
df = df.reindex(mux)
print (df)
However, what I need is that the levels of the index are in the order of "C M D" as well. If we check df.index, we get the following:
MultiIndex(levels=[[u'C', u'D', u'M'], [u'C', u'D', u'M']],
labels=[[0, 0, 0, 2, 2, 2, 1, 1, 1], [0, 2, 1, 0, 2, 1, 0, 2, 1]],
names=[u'level0', u'level1'])
Note the "levels" are still in the order of "C D M". What I want is such that when I use df.unstack(), I still get the index in the order of "C M D". Sorry for not making this clear.
Upvotes: 4
Views: 3506
Reputation: 862681
Use reindex
by new MultiIndex.from_product
:
np.random.seed(2018)
index = pd.MultiIndex(levels=[[u'C', u'D', u'M'], [u'C', u'D', u'M']],
labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]],
names=[u'level0', u'level1'])
df = pd.DataFrame(np.random.randint(10,size=(9,3)),
index=index,columns=['C','M','D'])
print (df)
C M D
level0 level1
C C 6 2 9
D 5 4 6
M 9 9 7
D C 9 6 6
D 1 0 6
M 5 6 7
M C 0 7 8
D 7 9 4
M 8 1 2
L = list('CMD')
mux = pd.MultiIndex.from_product([L, L], names=df.index.names)
df = df.reindex(mux)
print (df)
C M D
level0 level1
C C 6 2 9
M 9 9 7
D 5 4 6
M C 0 7 8
M 8 1 2
D 7 9 4
D C 9 6 6
M 5 6 7
D 1 0 6
EDIT:
If need set ordering create ordered CategoricalIndex and then simply sort_index
:
L = pd.CategoricalIndex(list('CDM'), ordered=True, categories=list('CMD'))
df.index = pd.MultiIndex.from_product([L, L], names=df.index.names)
df = df.sort_index()
print (df)
C M D
level0 level1
C C 6 2 9
M 9 9 7
D 5 4 6
M C 0 7 8
M 8 1 2
D 7 9 4
D C 9 6 6
M 5 6 7
D 1 0 6
Check unstack
for new ordering:
print (df.unstack())
C M D
level1 C M D C M D C M D
level0
C 6 9 5 2 9 4 9 7 6
M 0 8 7 7 1 9 8 2 4
D 9 5 1 6 6 0 6 7 6
Upvotes: 3