fish
fish

Reputation: 2513

Pandas, convert MultiIndex data frame some columns into rows

I have a data frame with MultiIndex like this:

col_type1  col_type2   col_a           col_b           colc
col_type1  col_type2   col_x  col_y    col_x col_y    col_x  col_y
type11     type21      10     100      11    101      12     102
type12     type22      20     200      21    201      22     202
type13     type23      30     300      31    301      32     302

this is the create data frame code:

pd.DataFrame.from_dict(
{('col_type1', 'col_type1'): {0: 'type11', 1: 'type12', 2: 'type13'},
 ('col_type2', 'col_type2'): {0: 'type21', 1: 'type22', 2: 'type23'},
 ('col_a', 'col_x'): {0: '10', 1: '20', 2: '30'},
 ('col_a', 'col_y'): {0: '100', 1: '200', 2: '300'},
 ('col_b', 'col_x'): {0: '11', 1: '21', 2: '31'},
 ('col_b', 'col_y'): {0: '101', 1: '201', 2: '301'},
 ('col_c', 'col_x'): {0: '12', 1: '22', 2: '32'},
 ('col_c', 'col_y'): {0: '102', 1: '202', 2: '302'}})

I want to melt this data frame to this format, keep the col_type1, col_type2 and convert the first level columns to rows:

col_type1 col_type2  col_convert col_x  col_y
type11    type21     col_a       10     100  
type11    type21     col_b       11     101
type11    type21     col_c       12     102
type12    type22     col_a       20     200  
type12    type22     col_b       21     201
type12    type22     col_c       22     202
type13    type23     col_a       30     300  
type13    type23     col_b       31     301
type13    type23     col_c       32     302

I tried melt(), this method can set col_level.

But when I set it to 0, it will lose level 1.

When I set it to 1 it will lose level 0.

I tried unstack(), this method can not set something like col_level,

I have to filter the type1 first and drop column col_type,

Then unstack the data twice, then append col_type as type1,

and the type2, type3...

Is there some better way to do this?

Upvotes: 2

Views: 68

Answers (1)

Scott Boston
Scott Boston

Reputation: 153460

Updated:

df = pd.DataFrame.from_dict(
{('col_type1', 'col_type1'): {0: 'type11', 1: 'type12', 2: 'type13'},
 ('col_type2', 'col_type2'): {0: 'type21', 1: 'type22', 2: 'type23'},
 ('col_a', 'col_x'): {0: '10', 1: '20', 2: '30'},
 ('col_a', 'col_y'): {0: '100', 1: '200', 2: '300'},
 ('col_b', 'col_x'): {0: '11', 1: '21', 2: '31'},
 ('col_b', 'col_y'): {0: '101', 1: '201', 2: '301'},
 ('col_c', 'col_x'): {0: '12', 1: '22', 2: '32'},
 ('col_c', 'col_y'): {0: '102', 1: '202', 2: '302'}})

df.set_index([('col_type1', 'col_type1'),('col_type2', 'col_type2')])\
  .stack(0)\
  .reset_index()\
  .rename(columns={('col_type1', 'col_type1'):'col_type1',
                   ('col_type2', 'col_type2'):'col_type2',
                   'level_2':'col_convert'})

Output:

  col_type1 col_type2 col_convert col_x col_y
0    type11    type21       col_a    10   100
1    type11    type21       col_b    11   101
2    type11    type21       col_c    12   102
3    type12    type22       col_a    20   200
4    type12    type22       col_b    21   201
5    type12    type22       col_c    22   202
6    type13    type23       col_a    30   300
7    type13    type23       col_b    31   301
8    type13    type23       col_c    32   302

Try, stacking level zero of the multiindex column:

df.stack(0).reset_index()

Output:

       0 level_1  col_x  col_y
0  type1   col_a     10    100
1  type1   col_b     11    101
2  type1   col_c     12    102
3  type2   col_a     20    200
4  type2   col_b     21    201
5  type2   col_c     22    202
6  type3   col_a     30    300
7  type3   col_b     31    301
8  type3   col_c     32    302

Upvotes: 1

Related Questions