Kevin Nash
Kevin Nash

Reputation: 1561

Sorting pandas column based on number in the suffix after underscore

I have a Dataframe with the below set of columns:

bill_id, product_1, product_20, product_300, price_1, price_20, price_300, quantity_1, quantity_20, quantity_300

I would like this to be sorted in the below sequence based on the number after the underscore at the end of each column label

bill_id, product_1, price_1, quantity_1, product_20, price_20, quantity_20, product_300, price_300, quantity_300

Upvotes: 1

Views: 339

Answers (1)

jezrael
jezrael

Reputation: 863246

Use sorted with lambda function by number after _ by all columns without first and then change order by DataFrame.reindex:

c = 'bill_id, product_1, product_20, product_300, price_1, price_20, price_300, quantity_1, quantity_20, quantity_300'

df = pd.DataFrame(columns=c.split(', '))
print (df)
Empty DataFrame
Columns: [bill_id, product_1, product_20, product_300, 
          price_1, price_20, price_300, quantity_1, quantity_20, quantity_300]
Index: []

c = sorted(df.columns[1:], key=lambda x: int(x.split('_')[-1]))
print (c)
['product_1', 'price_1', 'quantity_1', 
 'product_20', 'price_20', 'quantity_20', 
 'product_300', 'price_300', 'quantity_300']

df = df.reindex(df.columns[:1].tolist() + c, axis=1)
print (df)
Columns: [bill_id, product_1, price_1, quantity_1, 
         product_20, price_20, quantity_20, 
         product_300, price_300, quantity_300]
Index: []

Another idea is create index by all non product columns and sorting by all columns:

df = df.set_index('bill_id')
c = sorted(df.columns, key=lambda x: int(x.split('_')[-1]))

df = df.reindex(c, axis=1)

Upvotes: 2

Related Questions