Reputation: 379
Hi all so I have 2 dataframe I am trying to merge and group in a particular order:
df1
LC_REF Category PRDGRP
0 17 1C H Ferrari,Lambo,Merc
1 17 1C M Doritos,Lays,Funyun
2 17 1C P Cats,Dogs,Rabbits
3 16 2C H Aston,Hyundai,Honda
4 16 2C M Cheeto, Vicks
5 16 2C P Rat,Pig,Flamingo
6 17 2C M See,Sea,Far
df2
LC_REF Category PRDGRP
0 17 1C H foo,bar
1 17 1C M foo,bar1
2 16 2C H foo,bar2
3 16 2C M foo,bar3
4 17 2C H foo,bar4
5 17 2C M foo,bar5
6 17 2C P foo,bar6
And I am looking merge them such that all M's of one LC_REF is stacked, then all H's, then all P's, then move onto the second LC_REF. Order does not matter but should be consistent. Hopefully this makes sense:
df3
LC_REF Category PRDGRP
0 17 1C M Doritos,Lays,Funyun
1 17 1C M foo,bar1
2 17 1C H Ferrari,Lambo,Merc
3 17 1C H foo,bar
4 17 1C P Cats,Dogs,Rabbits
5 16 2C M Cheeto, Vicks
6 16 2C M foo,bar3
7 16 2C H Aston,Hyundai,Honda
8 16 2C H foo,bar4
9 17 2C M See,Sea,Far
10 17 2C M foo,bar5
11 17 2C P foo,bar6
I've tried variations of concat and append to no avail:
pd.concat([df1,df2]).sort_index().reset_index(drop=True)
is almost close, but the LC_REF are out of order
Upvotes: 1
Views: 41
Reputation: 153500
Let's use pd.concat
and sort_values
:
df_out = pd.concat([df1,df2])
df_out['Category'] = df_out.Category.astype('category', categories=['M','H','P'], ordered=True)
df_out.sort_values(by=['LC_REF','Category'])
Output:
LC_REF Category PRDGRP
4 16 2C M Cheeto, Vicks
3 16 2C M foo,bar3
3 16 2C H Aston,Hyundai,Honda
2 16 2C H foo,bar2
5 16 2C P Rat,Pig,Flamingo
1 17 1C M Doritos,Lays,Funyun
1 17 1C M foo,bar1
0 17 1C H Ferrari,Lambo,Merc
0 17 1C H foo,bar
2 17 1C P Cats,Dogs,Rabbits
6 17 2C M See,Sea,Far
5 17 2C M foo,bar5
4 17 2C H foo,bar4
6 17 2C P foo,bar6
Upvotes: 2