Reputation: 1
I am trying to sort a Series in pandas, but the result is not what I expect. I think it should be [1, 3, 5, 10, python].
On what basis is it sorted this way?
import pandas as pd
s1 = pd.Series(['1','3','python','10','5'])
s1.sort_values(ascending=True)
Upvotes: 0
Views: 269
Reputation: 261860
As explained in the comments, you have strings, so '5' is greater than '10' (strings are compared character by character, and '5' > '1').
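To see the character-by-character comparison in isolation, here is a minimal plain-Python sketch (no pandas involved; the variable names are only for illustration):

```python
# The same values as in the question, all stored as strings.
values = ['1', '3', 'python', '10', '5']

# Strings are compared character by character, so '10' sorts before '3' and '5'
# because its first character '1' is smaller than '3' and '5'.
as_strings = sorted(values)
print(as_strings)      # ['1', '10', '3', '5', 'python']

# The same comparison gives opposite answers for strings and for integers.
print('5' > '10')      # True
print(5 > 10)          # False
```

This is the same ordering that sort_values produces on an object-dtype Series of strings.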
One workaround is to use natsort for natural sorting:
from natsort import natsort_key
s1.sort_values(ascending=True, key=natsort_key)
Output:
0 1
1 3
4 5
3 10
2 python
dtype: object
Alternative without natsort (numbers first, strings after):
# Build (numeric value or inf, original string) tuples to use as the sort key
key = lambda s: (
    pd.concat([pd.to_numeric(s, errors='coerce').fillna(float('inf')), s], axis=1)
      .agg(tuple, axis=1)
)
s1.sort_values(ascending=True, key=key)
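The tuple key above can also be illustrated in plain Python. This is only a sketch of the same idea, not part of the pandas API: `numbers_first` is a hypothetical helper name, and it assumes every value is a string.

```python
def numbers_first(value):
    """Return a (number, text) tuple; non-numeric text gets infinity as its number."""
    try:
        number = float(value)
    except ValueError:
        # Not parseable as a number: push it after all finite numbers.
        number = float('inf')
    return (number, value)

values = ['1', '3', 'python', '10', '5']
result = sorted(values, key=numbers_first)
print(result)  # ['1', '3', '5', '10', 'python']
```

Tuples are compared element by element, so the numeric part decides the order first and the original string only breaks ties, for example among several non-numeric values.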
Upvotes: 3