Joe the Second

Reputation: 362

Python: How to force pandas to sort columns in a dataset?

I have the dataset below:

import pandas as pd

df = pd.DataFrame({2002: [None, None, 2, 4, 5],
                   "Facility": [5, 5, 6, 44, 2],
                   2003: [None, None, None, 1, 5],
                   2004: [4, 4, 3, 2, 6]})

I need to sort the columns. To do so, I use the following code:

df = df.reindex(sorted(df.columns), axis=1)

However, it fails with the following error:

TypeError: '<' not supported between instances of 'str' and 'int'

I know the error appears because one of the column names is a str while the others are int, but how can I solve this problem?

The result I want has the columns sorted as below:

'Facility',2002,2003,2004

Upvotes: 2

Views: 202

Answers (2)

zerecees

Reputation: 699

Setup: make sure the column labels that are supposed to be strings are str, and the ones that are supposed to be integers are int.

1) Get a list of the columns:

my_columns = list(df.columns)

2) Remove "Facility" from the list:

my_columns.remove("Facility")

3) Sort the list of integers:

my_columns.sort()

4) Insert "Facility" at the front of the list:

my_columns.insert(0, "Facility")

5) Reorder the DataFrame with the newly ordered my_columns:

df = df[my_columns]

6) Optionally, convert all column labels to strings. Note that astype returns a new Index, so assign it back:

df.columns = df.columns.astype(str)
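
The steps above can be combined into one small function. This is only a sketch: the helper name `facility_first` and its `label` parameter are made up for illustration, and it assumes every column other than "Facility" has an integer label, as in the question's data. It leaves out the optional string conversion from step 6.

```python
import pandas as pd

df = pd.DataFrame({2002: [None, None, 2, 4, 5],
                   "Facility": [5, 5, 6, 44, 2],
                   2003: [None, None, None, 1, 5],
                   2004: [4, 4, 3, 2, 6]})


def facility_first(frame, label="Facility"):
    """Hypothetical helper: put `label` first, then the other columns sorted."""
    cols = list(frame.columns)   # step 1: copy the column labels into a list
    cols.remove(label)           # step 2: take the string label out
    cols.sort()                  # step 3: the rest are ints, so sorting works
    cols.insert(0, label)        # step 4: put the string label back in front
    return frame[cols]           # step 5: reorder the DataFrame


result = facility_first(df)
print(list(result.columns))  # ['Facility', 2002, 2003, 2004]
```

The year labels stay integers here, so selecting a column afterwards still works with `result[2002]`.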

Upvotes: 0

Sal-laS

Reputation: 11649

You are almost there.

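As you already mentioned, your column names are a mix of str and int, so the sort fails. Convert the column names to strings first (astype returns a new Index, so assign it back), then sort the columns as you did before:

df.columns = df.columns.astype(str)
df = df.reindex(sorted(df.columns), axis=1)

Note that a plain string sort places 'Facility' after the years, because digits sort before letters.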
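
If 'Facility' has to come first and the year labels should stay integers, one option is to give `sorted` a key instead of converting anything. The sketch below is not part of the original answer. It assumes the labels are only plain str and int values, and the variable names `ordered` and `df_sorted` are just for illustration. The key is a tuple whose first element puts strings ahead of everything else, so a string is never compared with an integer.

```python
import pandas as pd

df = pd.DataFrame({2002: [None, None, 2, 4, 5],
                   "Facility": [5, 5, 6, 44, 2],
                   2003: [None, None, None, 1, 5],
                   2004: [4, 4, 3, 2, 6]})

# Key is (is_not_a_string, label): strings get False and sort first,
# integers get True and sort after them. Within each group the labels
# share a type, so comparing them is safe.
ordered = sorted(df.columns, key=lambda c: (not isinstance(c, str), c))

df_sorted = df.reindex(ordered, axis=1)
print(list(df_sorted.columns))  # ['Facility', 2002, 2003, 2004]
```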

Upvotes: 1
