Reputation: 249

Sorting dataframe by specific column names in Pandas

How can I sort the columns of a pandas DataFrame by specific column names? My DataFrame's columns look like this:

+-------+-------+-----+------+------+----------+
|movieId| title |drama|horror|action|  comedy  |
+-------+-------+-----+------+------+----------+
|                                              |
+-------+-------+-----+------+------+----------+

I would like to sort only the columns ['drama','horror','action','comedy'] and leave movieId and title in place, so that I get the following dataframe:

+-------+-------+------+------+------+----------+
|movieId| title |action|comedy|drama |  horror  |
+-------+-------+------+------+------+----------+
|                                               |
+-------+-------+------+------+------+----------+

I tried df = df.sort_index(axis=1), but it sorts all columns, including movieId and title:

+-------+-------+------+------+-------+----------+
|action | comedy|drama |horror|movieId|  title   |
+-------+-------+------+------+-------+----------+
|                                                |
+-------+-------+------+------+-------+----------+

Upvotes: 2

Answers (3)

AnuragPandey

Reputation: 1

Another way would be to set movieId and title as the index of the DataFrame and then sort the remaining columns by name.

df.set_index(['movieId', 'title'], inplace=True)
df.sort_index(axis=1, inplace=True)
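
A minimal sketch of this approach, assuming a small made-up DataFrame (only the column names come from the question; the rows are invented). Note that movieId and title end up in the index, so reset_index() is used here to turn them back into ordinary leading columns:

```python
import pandas as pd

# Hypothetical sample data; only the column names are from the question.
df = pd.DataFrame({
    "movieId": [1, 2],
    "title": ["A", "B"],
    "drama": [1, 0],
    "horror": [0, 1],
    "action": [1, 1],
    "comedy": [0, 0],
})

# Move the columns that must stay put into the index, then sort the rest by name.
indexed = df.set_index(["movieId", "title"]).sort_index(axis=1)

# Bring movieId and title back as the first two columns.
result = indexed.reset_index()
print(result.columns.tolist())
```

The non-inplace form is used so the original df is left untouched.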

Upvotes: 0

jezrael

Reputation: 862851

Sort all columns after the second one and prepend the first 2 columns:

c = df.columns[:2].tolist() + sorted(df.columns[2:].tolist())
print (c)
['movieId', 'title', 'action', 'comedy', 'drama', 'horror']

Finally, change the order of the columns using this list:

df1 = df[c]
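
Put together, the two steps above might look like the following sketch. The sample rows are an assumption made for illustration; selecting with a list of labels reorders whole columns, so each value stays under its own label:

```python
import pandas as pd

# Hypothetical sample data; only the column names are from the question.
df = pd.DataFrame({
    "movieId": [1, 2],
    "title": ["A", "B"],
    "drama": [1, 0],
    "horror": [0, 1],
    "action": [1, 1],
    "comedy": [0, 0],
})

# Keep the first 2 columns as they are and sort the remaining names alphabetically.
c = df.columns[:2].tolist() + sorted(df.columns[2:].tolist())

# Selecting by the new list of labels returns a reordered copy.
df1 = df[c]
print(df1.columns.tolist())
```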

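Another idea is to use DataFrame.sort_index, but only on the columns after the first 2 (selected by DataFrame.iloc), and then join the result back to the first 2 columns with concat. Assigning the sorted part back through df.iloc would not work, because assignment never changes the column labels:

df1 = pd.concat([df.iloc[:, :2], df.iloc[:, 2:].sort_index(axis=1)], axis=1)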

Upvotes: 1

Jeff

Reputation: 634

You can explicitly rearrange columns like so:

df[['movieId','title','action','comedy','drama','horror']]

If you have a lot of columns to sort alphabetically (with numpy imported as np):

df[np.concatenate([['movieId','title'], df.drop(['movieId','title'], axis=1).columns.sort_values()])]
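
If the columns to sort are given as a list, as in the question, the same idea can be wrapped in a small helper. This is only a sketch: sort_selected_columns is a hypothetical name, not a pandas function, and the sample rows are invented. It sorts just the listed columns among the positions they already occupy and leaves every other column where it is:

```python
import pandas as pd


def sort_selected_columns(df, cols):
    """Return a copy of df where only the columns named in `cols` are
    reordered alphabetically; all other columns keep their positions."""
    selected = set(cols)
    # Ignore requested names that are not in the frame, to avoid a KeyError.
    present = [c for c in df.columns if c in selected]
    ordered = iter(sorted(present))
    # Walk the current order; fill each selected slot with the next sorted name.
    new_order = [next(ordered) if c in selected else c for c in df.columns]
    return df[new_order]


# Hypothetical sample data; only the column names are from the question.
df = pd.DataFrame({
    "movieId": [1, 2],
    "title": ["A", "B"],
    "drama": [1, 0],
    "horror": [0, 1],
    "action": [1, 1],
    "comedy": [0, 0],
})

out = sort_selected_columns(df, ["drama", "horror", "action", "comedy"])
print(out.columns.tolist())
```

Unlike slicing by position, this does not assume that the columns to keep in place come first.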

Upvotes: 1
