Ashutosh Sharma

Reputation: 1

Sorting a pandas Series not working correctly

I am trying to sort a Series in pandas, but the result is not what I expect. I expected the order [1, 3, 5, 10, python].

Can you please explain on what basis it is sorted this way?

import pandas as pd

s1 = pd.Series(['1','3','python','10','5'])

s1.sort_values(ascending=True)


Upvotes: 0

Views: 269

Answers (1)

mozway

Reputation: 261860

As explained in the comments, your Series contains strings, so '5' is greater than '10': strings are compared character by character, and '5' > '1'.
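To see this comparison in isolation, here is a minimal plain-Python sketch (no pandas needed). The variable names are only illustrative:

```python
# Strings compare character by character, using code point order.
values = ['1', '3', 'python', '10', '5']

# '5' vs '10': the first characters decide, and '5' > '1'.
five_gt_ten = '5' > '10'

# Sorting the strings therefore puts '10' right after '1'.
as_strings = sorted(values)

# Comparing actual integers gives the numeric order instead.
as_ints = sorted([1, 3, 10, 5])

print(five_gt_ten)  # True
print(as_strings)   # ['1', '10', '3', '5', 'python']
print(as_ints)      # [1, 3, 5, 10]
```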

One workaround is to use natsort for natural sorting:

from natsort import natsort_key

s1.sort_values(ascending=True, key=natsort_key)

Output:

0         1
1         3
4         5
3        10
2    python
dtype: object
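For intuition about what natural sorting does, here is a rough plain-Python sketch. It is not natsort's actual implementation: natural_key is a hypothetical helper that splits each string into digit and non-digit chunks and compares the digit chunks as integers.

```python
import re

def natural_key(text):
    # re.split with a capturing group keeps the digit runs, so the result
    # alternates: non-digit chunk, digit chunk, non-digit chunk, ...
    parts = re.split(r'(\d+)', text)
    # Odd positions hold the digit runs; convert those to int.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

values = ['1', '3', 'python', '10', '5']
result = sorted(values, key=natural_key)
print(result)  # ['1', '3', '5', '10', 'python']
```

Because every key starts with a (possibly empty) string chunk and alternates from there, strings are only ever compared with strings and integers with integers.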

Alternative without natsort (numbers first, strings after):

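# Build (numeric value, original string) tuples; non-numeric strings get
# +inf as the first element, so they sort after all the numbers.
key = lambda s: (pd.concat([pd.to_numeric(s, errors='coerce')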
                              .fillna(float('inf')), s], axis=1)
                   .agg(tuple, axis=1)
                 )
s1.sort_values(ascending=True, key=key)
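The idea behind this key can be sketched in plain Python: each value is mapped to a (number, string) tuple, with infinity standing in for anything that is not numeric. tuple_key is a hypothetical name used only for illustration, and 'apple' is added to the sample data to show the tie-break among non-numeric values.

```python
def tuple_key(text):
    # Numeric strings sort by their value; everything else gets +inf
    # so that it lands after all the numbers.
    try:
        number = float(text)
    except ValueError:
        number = float('inf')
    # The original string breaks ties, e.g. among non-numeric values.
    return (number, text)

values = ['1', '3', 'python', '10', '5', 'apple']
result = sorted(values, key=tuple_key)
print(result)  # ['1', '3', '5', '10', 'apple', 'python']
```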

Upvotes: 3
