andbeonetraveler

Reputation: 725

Pandas sort dataframe by column with strings and integers

I have a dataframe with a column containing both integers and strings:

>>> df = pd.DataFrame({'a':[2,'c',1,10], 'b':[5,4,0,6]})
>>> df
    a  b
0   2  5
1   c  4
2   1  0
3  10  6

I want to sort the dataframe by column a, treating the strings and integers separately, with strings first:

>>> df
    a  b
1   c  4
2   1  0
0   2  5
3  10  6

...but Python 3 doesn't allow comparing integers to strings, so sorting the mixed column fails:

TypeError: unorderable types: int() > str()

If I first convert all the integers to strings, I don't get what I want:

>>> df.a = df.a.astype(str)
>>> df.sort_values('a')
    a  b
0   1  0
3  10  6
2   2  5
1   c  4

Does anyone know of a one-line way to tell Pandas that I want it to sort strings first, then integers, without first breaking the dataframe into pieces?

Upvotes: 0

Views: 5563

Answers (1)

akuiper

Reputation: 215047

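One option is to group the data frame by whether the value in column a is a string, and then sort each group separately. The grouping key is False for strings and True for everything else, and groupby orders the groups by key, so the string rows come first. Note that the final reset_index(drop=True) discards the original row labels: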

df.groupby(df.a.apply(type) != str).apply(lambda g: g.sort_values('a')).reset_index(drop=True)
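
If the original row labels need to survive, as in the desired output in the question, the same "strings first, then numbers" idea can be sketched without groupby. This is only an illustration, not the approach above: the names sort_key, order and result are made up for the example, and it assumes every non-string value in column a is a number that can be compared with the others.

```python
import pandas as pd

df = pd.DataFrame({'a': [2, 'c', 1, 10], 'b': [5, 4, 0, 6]})

def sort_key(label):
    """Return (is_not_string, value) for the row with this index label."""
    value = df.at[label, 'a']
    # False sorts before True, so strings come first. Values are only
    # compared with each other when the first tuple element is equal,
    # which avoids ever comparing an int with a str.
    return (not isinstance(value, str), value)

# Sort the index labels with plain Python, then reorder the rows.
order = sorted(df.index, key=sort_key)
result = df.loc[order]
print(result)
```

Because the rows are reordered with df.loc, the index labels travel with their rows instead of being reset.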


Upvotes: 3
