Reputation: 2513
I have a data frame with MultiIndex like this:
col_type1 col_type2 col_a col_b colc
col_type1 col_type2 col_x col_y col_x col_y col_x col_y
type11 type21 10 100 11 101 12 102
type12 type22 20 200 21 201 22 202
type13 type23 30 300 31 301 32 302
this is the create data frame code:
pd.DataFrame.from_dict(
{('col_type1', 'col_type1'): {0: 'type11', 1: 'type12', 2: 'type13'},
('col_type2', 'col_type2'): {0: 'type21', 1: 'type22', 2: 'type23'},
('col_a', 'col_x'): {0: '10', 1: '20', 2: '30'},
('col_a', 'col_y'): {0: '100', 1: '200', 2: '300'},
('col_b', 'col_x'): {0: '11', 1: '21', 2: '31'},
('col_b', 'col_y'): {0: '101', 1: '201', 2: '301'},
('col_c', 'col_x'): {0: '12', 1: '22', 2: '32'},
('col_c', 'col_y'): {0: '102', 1: '202', 2: '302'}})
I want to melt this data frame to this format, keep the col_type1
, col_type2
and convert the first level columns to rows:
col_type1 col_type2 col_convert col_x col_y
type11 type21 col_a 10 100
type11 type21 col_b 11 101
type11 type21 col_c 12 102
type12 type22 col_a 20 200
type12 type22 col_b 21 201
type12 type22 col_c 22 202
type13 type23 col_a 30 300
type13 type23 col_b 31 301
type13 type23 col_c 32 302
I tried melt()
, this method can set col_level
.
But when I set it to 0
, it will lose level 1
.
When I set it to 1
it will lose level 0
.
I tried unstack()
, this method can not set something like col_level,
I have to filter the type1
first and drop column col_type
,
Then unstack
the data twice, then append col_type
as type1
,
and the type2
, type3
...
Is there some better way to do this?
Upvotes: 2
Views: 68
Reputation: 153460
Updated:
df = pd.DataFrame.from_dict(
{('col_type1', 'col_type1'): {0: 'type11', 1: 'type12', 2: 'type13'},
('col_type2', 'col_type2'): {0: 'type21', 1: 'type22', 2: 'type23'},
('col_a', 'col_x'): {0: '10', 1: '20', 2: '30'},
('col_a', 'col_y'): {0: '100', 1: '200', 2: '300'},
('col_b', 'col_x'): {0: '11', 1: '21', 2: '31'},
('col_b', 'col_y'): {0: '101', 1: '201', 2: '301'},
('col_c', 'col_x'): {0: '12', 1: '22', 2: '32'},
('col_c', 'col_y'): {0: '102', 1: '202', 2: '302'}})
df.set_index([('col_type1', 'col_type1'),('col_type2', 'col_type2')])\
.stack(0)\
.reset_index()\
.rename(columns={('col_type1', 'col_type1'):'col_type1',
('col_type2', 'col_type2'):'col_type2',
'level_2':'col_convert'})
Output:
col_type1 col_type2 col_convert col_x col_y
0 type11 type21 col_a 10 100
1 type11 type21 col_b 11 101
2 type11 type21 col_c 12 102
3 type12 type22 col_a 20 200
4 type12 type22 col_b 21 201
5 type12 type22 col_c 22 202
6 type13 type23 col_a 30 300
7 type13 type23 col_b 31 301
8 type13 type23 col_c 32 302
Try, stacking level zero of the multiindex column:
df.stack(0).reset_index()
Output:
0 level_1 col_x col_y
0 type1 col_a 10 100
1 type1 col_b 11 101
2 type1 col_c 12 102
3 type2 col_a 20 200
4 type2 col_b 21 201
5 type2 col_c 22 202
6 type3 col_a 30 300
7 type3 col_b 31 301
8 type3 col_c 32 302
Upvotes: 1