Nvouk

Reputation: 27

Sort dataframe by multiple columns while ignoring case

I want to sort a dataframe by multiple columns like this:

df.sort_values( by=[ 'A', 'B', 'C', 'D', 'E' ], inplace=True )

However, I found that pandas sorts all the uppercase values first and then the lowercase ones.

I tried this:

df.sort_values( by=[ 'A', 'B', 'C', 'D', 'E' ], inplace=True, key=lambda x: x.str.lower() )

but I get this error:

TypeError: sort_values() got an unexpected keyword argument 'key'

I could convert all the columns to lowercase, but I want to keep the values as they are.

Any hints?

Upvotes: 1

Views: 1139

Answers (1)

jezrael

Reputation: 863226

According to the docs for DataFrame.sort_values, the key parameter was added in pandas 1.1.0, so you need to upgrade pandas to that version or later for this to work:

key - callable, optional

Apply the key function to the values before sorting. This is similar to the key argument in the builtin sorted() function, with the notable difference that this key function should be vectorized. It should expect a Series and return a Series with the same shape as the input. It will be applied to each column in by independently.

New in version 1.1.0.
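To see whether the installed pandas already supports this, one option is to print the version and inspect the method signature. This is only a quick sketch; the variable name has_key is illustrative.

```python
import inspect

import pandas as pd

# Show the installed pandas version (key= needs 1.1.0 or later).
print(pd.__version__)

# Check directly whether sort_values accepts a "key" argument.
has_key = "key" in inspect.signature(pd.DataFrame.sort_values).parameters
print(has_key)
```

If this prints False, upgrading (for example with pip install --upgrade pandas) should make the key argument available.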

Sample:

import pandas as pd

df = pd.DataFrame({
    'A': list('MmMJJj'),
    'B': list('aYAbCc')
})
# key is applied to each column in by independently (needs pandas >= 1.1.0)
df.sort_values(by=['A', 'B'], inplace=True, key=lambda x: x.str.lower())
print(df)
   A  B
3  J  b
4  J  C
5  j  c
0  M  a
2  M  A
1  m  Y
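If upgrading is not possible, a workaround is to sort a lowercased copy of the columns and reuse its index order on the original frame, which leaves the original values untouched. This is a minimal sketch, assuming the index is unique and every sort column holds strings; the names cols, lowered and df_sorted are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': list('MmMJJj'),
    'B': list('aYAbCc')
})
cols = ['A', 'B']

# Lowercased copy used only to determine the row order.
lowered = df[cols].apply(lambda s: s.str.lower())

# Sort the copy, then reorder the original rows by the resulting index.
# Assumes the index labels are unique.
order = lowered.sort_values(by=cols).index
df_sorted = df.loc[order]
print(df_sorted)
```

The original casing is preserved because only the temporary copy is lowercased.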

Upvotes: 3
