Reputation: 546
I currently have a dataframe as following.
No | ID | Sub_No | Weight |
---|---|---|---|
1 | a23mcsk | 2 | 30 |
2 | smcd302 | 3 | 60 |
3 | a23mcsk | 1 | 24 |
4 | smcd302 | 2 | 45 |
5 | a23mcsk | 3 | 18 |
6 | smcd302 | 1 | 12 |
I want to be able to sort this dataframe first by 'ID' and then by the 'Sub_No'. Is there a way I can do this on Python using Pandas?
Expected Result:
No | ID | Sub_No | Weight |
---|---|---|---|
3 | a23mcsk | 1 | 24 |
1 | a23mcsk | 2 | 30 |
5 | a23mcsk | 3 | 18 |
6 | smcd302 | 1 | 12 |
4 | smcd302 | 2 | 45 |
2 | smcd302 | 3 | 60 |
Upvotes: 0
Views: 857
Reputation: 863031
Use helper column here for correct sorting by numeric values from ID
with Sub_No
:
df['new'] = df['ID'].str.extract('(\d+)').astype(int)
df = df.sort_values(by=['new', 'Sub_No'], ascending=False).drop('new', axis=1)
Another idea with natural sorting:
import natsort as ns
df['new'] = pd.Categorical(df['ID'], ordered=True, categories= ns.natsorted(df['a'].unique()))
df = df.sort_values(by=['new', 'Sub_No'], ascending=False).drop('new', axis=1)
Upvotes: 1