Diziet Asahi
Diziet Asahi

Reputation: 40697

ordering the MultiIndex columns after pivot_table

I have the following dataframe:

df = pd.DataFrame({'mean':np.arange(0,12),
                   'lo':np.arange(12,24),
                   'hi':np.arange(24,36),
                   'prop':['prop1']*6+['prop2']*6,
                   'group':['group1','group2','group3']*4,
                   'type':['type1','type2']*6})'effect_size'
df.head()

output:

  mean  lo  hi  prop    group   type
0   0   12  24  prop1   group1  type1
1   1   13  25  prop1   group2  type2
2   2   14  26  prop1   group3  type1
3   3   15  27  prop1   group1  type2
4   4   16  28  prop1   group2  type1

I want to pivot the table so that the resulting table has these columns (desired output):

      type1                                type2
      group1       group2      group3      group1      group2      group3
      mean  lo  hi mean lo  hi mean lo  hi mean lo  hi mean lo  hi mean lo  hi
prop1   0   12  24  4   16  28  2   14  26  3   15  27  1   13  25  5   17  29
prop2   6   18  30  10  22  34  8   20  32  9   21  33  7   19  31  11  23  35

I have tried using pivot() (doesn't work because of duplicated index values), and pivot_table():

df2 = df.pivot_table(index='prop',columns=['group','type'], values=['mean','lo','hi'])
df2

output:

    hi  lo  mean
group   group1  group2  group3  group1  group2  group3  group1  group2  group3
type    type1   type2   type1   type2   type1   type2   type1   type2   type1   type2   type1   type2   type1   type2   type1   type2   type1   type2
prop                                                                        
prop1   24  27  28  25  26  29  12  15  16  13  14  17  0   3   4   1   2   5
prop2   30  33  34  31  32  35  18  21  22  19  20  23  6   9   10  7   8   11

The result is what I want, except that the order of the columns is incorrect and difficult to read. How can I reorder the columns of df2 to get my desired output?

I have tried reindex() with a MultiIndex index, but I end up with a table filled with NaNs.

Upvotes: 1

Views: 1187

Answers (1)

Andy L.
Andy L.

Reputation: 25239

pandas pivoting table always sorts index and column names. You need post-processing the output if you want as your desired output.

Use swaplevel, sort_index and reindex on the output of pivot_table

df3 = (df2.swaplevel(0,2, axis=1).sort_index(1)
                                 .reindex(['mean','lo','hi'], level=2, axis=1))

Out[107]:
type   type1                                        type2                     \
group group1         group2         group3         group1         group2
        mean  lo  hi   mean  lo  hi   mean  lo  hi   mean  lo  hi   mean  lo
prop
prop1      0  12  24      4  16  28      2  14  26      3  15  27      1  13
prop2      6  18  30     10  22  34      8  20  32      9  21  33      7  19

type
group     group3
       hi   mean  lo  hi
prop
prop1  25      5  17  29
prop2  31     11  23  35

Upvotes: 3

Related Questions