VSP

Reputation: 379

How can I sort a pandas dataframe by a column of strings using a custom order for the letters?

I have a dataframe in Python in which one of the columns contains the letters D, W, M and Y, which stand for day, week, month and year. I want to sort my dataframe by this column in that order. However, the only option I could find sorts alphabetically, which is not what I need. Could you please advise whether this is possible?

The dataframe below is an example of what I am trying to sort. I need it sorted first by id, then by unit in the order D, W, M, Y, and then by time. I tried the regular sort function, but it does not work because it sorts the unit alphabetically.

dataframe

id  time  unit     
1   3     M
2   5     Y
1   12    D
8   6     W
6   1     M
1   2     W

dataframe.sort_values(by=['id', 'unit', 'time'])

Upvotes: 5

Views: 2819

Answers (2)

jhansen

Reputation: 1136

Alternatively, you could create a separate column in your DataFrame:

df['unit_index'] = df['unit'].apply(list('DWMY').index)
df.sort_values(['id','unit_index','time'])

   id  time unit  unit_index
2   1    12    D           0
5   1     2    W           1
0   1     3    M           2
1   2     5    Y           3
4   6     1    M           2
3   8     6    W           1
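
A self-contained variant of the same idea is sketched below, in case you do not want to keep the helper column. The dataframe is rebuilt from the question's sample data, and the name unit_rank is only illustrative; the helper column is added with assign and dropped again after sorting.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 1, 8, 6, 1],
    "time": [3, 5, 12, 6, 1, 2],
    "unit": ["M", "Y", "D", "W", "M", "W"],
})

# Map each unit letter to its position in the desired order D, W, M, Y.
unit_rank = {unit: position for position, unit in enumerate("DWMY")}

result = (
    df.assign(unit_index=df["unit"].map(unit_rank))  # temporary sort key
      .sort_values(["id", "unit_index", "time"])
      .drop(columns="unit_index")                    # remove the helper column
)
print(result)
```

Note that a unit letter missing from unit_rank would map to NaN and, by default, sort after the known units within its id.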

Upvotes: 0

yatu

Reputation: 88226

You can define the unit column as a pd.Categorical and then sort the rows.

A categorical is sorted according to the order of its categories rather than alphabetically, and if it is also declared as ordered (ordered=True) it supports min and max. The order is set by passing a categories list, which specifies the order to follow when sorting the dataframe. Then you can simply sort the dataframe as you were doing yourself with sort_values:

df['unit'] = pd.Categorical(df.unit.tolist(), categories=list('DWMY'))
df.sort_values(['id','unit','time'])

    id  time unit
2   1    12    D
5   1     2    W
0   1     3    M
1   2     5    Y
4   6     1    M
3   8     6    W
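
For completeness, here is a minimal runnable sketch of the same approach with the categorical explicitly marked as ordered, so that min and max are also available. The dataframe is rebuilt from the question's sample data; ordered=True is an addition for the min/max part and is not needed for the sort itself.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 1, 8, 6, 1],
    "time": [3, 5, 12, 6, 1, 2],
    "unit": ["M", "Y", "D", "W", "M", "W"],
})

# Declare the custom order D < W < M < Y on the unit column.
df["unit"] = pd.Categorical(df["unit"], categories=list("DWMY"), ordered=True)

# Sorting now follows the category order instead of alphabetical order.
result = df.sort_values(["id", "unit", "time"])
print(result)

# An ordered categorical also supports min and max.
smallest = df["unit"].min()
largest = df["unit"].max()
print(smallest, largest)
```

Any value not listed in categories becomes NaN in the converted column, so it is worth checking that the list covers every unit in your data.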

Upvotes: 6
