Umar.H
Umar.H

Reputation: 23099

Re-ordering df.columns alpha numerically

I have a df as follows:

        Store   Spend_1 Spend_2 Spend_3 Spend_4 Variance_1  Variance_2  Variance_3  Variance_4
0   1   200 230 189 200 -14 16  -6  18
1   2   224 104 240 203 -17 -11 17  -18
2   3   220 168 131 210 10  -9  12  19
3   4   143 187 139 188 -1  -17 -20 -9
4   5   179 121 162 131 6   -25 5   20
5   6   208 158 140 191 16  -14 -22 -6

I'm attempting to apply a custom sort on the column names to order it as so :

    Store   Spend_1 Variance_1  Spend_2 Variance_2  Spend_3 Variance_3  Spend_4 Variance_4
0   1   200 -14 230 16  189 -6  200 18
1   2   224 -17 104 -11 240 17  203 -18
2   3   220 10  168 -9  131 12  210 19
3   4   143 -1  187 -17 139 -20 188 -9
4   5   179 6   121 -25 162 5   131 20
5   6   208 16  158 -14 140 -22 191 -6

I've tried the simple sorted but obviously this applies alphabetically, ignoring the integer at the end.

I've toyed around with enumerating as number, cols the df.columns changing the strings to ints, applying a sort then using the numbers in the iloc but I'm not sure how apply a custom sort that way.

Upvotes: 2

Views: 68

Answers (4)

yatu
yatu

Reputation: 88295

Here's one approach splitting the columns on _, reversing the resulting lists so that further sorting prioritises the trailing digits and using pandas.Index.argsort:

df.iloc[:,[0]+[*df.columns.str.split('_').str[::-1].argsort()[:-1]]]

     Store  Spend_1  Variance_1  Spend_2  Variance_2  Spend_3  Variance_3  \
0      1      200         -14      230          16      189          -6   
1      2      224         -17      104         -11      240          17   
2      3      220          10      168          -9      131          12   
3      4      143          -1      187         -17      139         -20   
4      5      179           6      121         -25      162           5   
5      6      208          16      158         -14      140         -22   

     Spend_4  Variance_4  
0      200          18  
1      203         -18  
2      210          19  
3      188          -9  
4      131          20  
5      191          -6  

Upvotes: 1

jezrael
jezrael

Reputation: 863351

Idea is use key parameetr by 2 values - values after _ converted to inetegrs with first values before _, but solution is apply for all columns without first with df.columns[1:], so last is added first column by df.columns[:1].tolist():

cols = df.columns[:1].tolist() +sorted(df.columns[1:], 
                                       key=lambda x: (int(x.split('_')[1]), x.split('_')[0]))

df = df[cols]
print (df)
   Store  Spend_1  Variance_1  Spend_2  Variance_2  Spend_3  Variance_3  \
0      1      200         -14      230          16      189          -6   
1      2      224         -17      104         -11      240          17   
2      3      220          10      168          -9      131          12   
3      4      143          -1      187         -17      139         -20   
4      5      179           6      121         -25      162           5   
5      6      208          16      158         -14      140         -22   

   Spend_4  Variance_4  
0      200          18  
1      203         -18  
2      210          19  
3      188          -9  
4      131          20  
5      191          -6  

Upvotes: 3

Rotem Tal
Rotem Tal

Reputation: 769

simplest way i can think of is defining your own key to the sort

df = df.reindex(sorted(df.columns, key=lambda x: int(x.split("_")[1]) if "_" in x else 0), axis=1) 

Upvotes: 1

Dan
Dan

Reputation: 45752

You can pass a keyt to sorted to do your own custom sorting:

sorted_columns = sorted(df.columns, key = lambda col: col[-1] + col[:-1])
df[sorted_columns]

The ideas is to put the final integer first. This will breakdown if you can go into double digits.

Upvotes: 1

Related Questions