edesz

Reputation: 12406

Pandas DataFrame sort ignoring the case

I have a Pandas DataFrame in Python (the data comes from a linked source). I changed the case of the first letter of some entries in the "Single" column. Here is what I have:

import pandas as pd
df = pd.read_csv('test.csv')
print(df)

Position                       Artist                  Single               Year     Weeks
       1                Frankie Laine               I Believe               1953  18 weeks
       2                  Bryan Adams         I Do It for You               1991  16 weeks
       3                  Wet Wet Wet      love Is All Around               1994  15 weeks
       4  Drake (feat. Wizkid & Kyla)               One Dance               2016  15 weeks
       5                        Queen       bohemian Rhapsody  1975/76 & 1991/92  14 weeks
       6                 Slim Whitman              Rose Marie               1955  11 weeks
       7              Whitney Houston  i Will Always Love You               1992  10 weeks

I would like to sort by the Single column in ascending order (a to z). When I run

df.sort_values(by='Single',inplace=True)

the sort appears to treat uppercase and lowercase entries separately. Here is what I get:

Position                       Artist                  Single               Year     Weeks
       1                Frankie Laine               I Believe               1953  18 weeks
       2                  Bryan Adams         I Do It for You               1991  16 weeks
       4  Drake (feat. Wizkid & Kyla)               One Dance               2016  15 weeks
       6                 Slim Whitman              Rose Marie               1955  11 weeks
       5                        Queen       bohemian Rhapsody  1975/76 & 1991/92  14 weeks
       7              Whitney Houston  i Will Always Love You               1992  10 weeks
       3                  Wet Wet Wet      love Is All Around               1994  15 weeks

So it sorts the entries that start with an uppercase letter first, and then separately sorts the ones that start with a lowercase letter. I want a single combined sort that ignores the case of the first letter in the Single column. The row with "bohemian Rhapsody" is in the wrong place after sorting: it should be first, but it appears as the 5th row.
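For context, this ordering is consistent with strings being compared by code point, where every uppercase ASCII letter comes before every lowercase one. A minimal sketch in plain Python, using a few of the titles above (the variable names are only illustrative):

```python
titles = ["I Believe", "bohemian Rhapsody", "One Dance", "love Is All Around"]

# Default comparison is by code point: "Z" (90) sorts before "a" (97),
# so titles starting with an uppercase letter all come first.
default_order = sorted(titles)

# Comparing lowercased copies gives a case-insensitive order
# while leaving the original strings untouched.
folded_order = sorted(titles, key=str.lower)

print(default_order)
print(folded_order)
```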

Is there a way to sort a Pandas DataFrame while ignoring the case of the text in the Single column?

Upvotes: 36

Views: 17255

Answers (4)

RafG

Reputation: 1506

Pandas 1.1.0 introduced the key argument as a more intuitive way to achieve this:

df.sort_values(by='Single', inplace=True, key=lambda col: col.str.lower())
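As a self-contained sketch of the same call, assuming pandas 1.1.0 or later and a small DataFrame rebuilt by hand from the question's Position and Single columns (not read from test.csv):

```python
import pandas as pd

# Hand-built stand-in for the question's data (only two columns kept).
df = pd.DataFrame({
    "Position": [1, 2, 3, 4, 5, 6, 7],
    "Single": [
        "I Believe",
        "I Do It for You",
        "love Is All Around",
        "One Dance",
        "bohemian Rhapsody",
        "Rose Marie",
        "i Will Always Love You",
    ],
})

# key receives the whole column as a Series and must return a Series
# of the same length; the sort uses the lowercased values only for
# comparison, so the original casing is preserved in the result.
result = df.sort_values(by="Single", key=lambda col: col.str.lower())
print(result)
```

Returning a new DataFrame instead of using inplace=True leaves df unchanged, which makes the before and after easier to compare.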

Upvotes: 43

Sujata Khedkar

Reputation: 1

Create a new lowercase column, sort by it, and delete it afterward.

df["Single.Lower"] = df["Single"].str.lower()
df.sort_values(['Single.Lower'], axis=0, ascending=True, inplace=True)
del df["Single.Lower"]

Upvotes: -1

akuiper

Reputation: 215057

You can convert all strings to upper or lower case and then call argsort(), which returns the integer positions that put the values in sorted order. Passing those positions to iloc reorders the DataFrame by Single, ignoring case:

df.iloc[df.Single.str.lower().argsort()]
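A minimal runnable sketch of this approach, assuming a small hand-built DataFrame in place of the question's CSV (the intermediate names lowered and order are only illustrative):

```python
import pandas as pd

# Hand-built stand-in for the question's data (only two columns kept).
df = pd.DataFrame({
    "Position": [1, 2, 3, 4, 5, 6, 7],
    "Single": [
        "I Believe",
        "I Do It for You",
        "love Is All Around",
        "One Dance",
        "bohemian Rhapsody",
        "Rose Marie",
        "i Will Always Love You",
    ],
})

lowered = df["Single"].str.lower()

# argsort() gives integer positions (not index labels) that would sort
# the lowercased values; iloc is positional, so the two fit together.
order = lowered.argsort()
result = df.iloc[order.to_numpy()]
print(result)
```

Because argsort() returns positions, this works with iloc even when the DataFrame has a non-default index.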

Upvotes: 24

DYZ

Reputation: 57085

Create a copy of Single in all upper case letters and sort by that column:

df["Single.Upper"] = df["Single"].str.upper()
df.sort_values(by="Single.Upper", inplace=True)

You can delete the column later:

del df["Single.Upper"]
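If you would rather not modify df at all, the same temporary-column idea can be written as one chained expression. This is only a sketch on a hand-built DataFrame; the helper column name _sort_key is hypothetical:

```python
import pandas as pd

# Hand-built stand-in for the question's data (only two columns kept).
df = pd.DataFrame({
    "Position": [1, 2, 3, 4, 5, 6, 7],
    "Single": [
        "I Believe",
        "I Do It for You",
        "love Is All Around",
        "One Dance",
        "bohemian Rhapsody",
        "Rose Marie",
        "i Will Always Love You",
    ],
})

# assign() returns a copy with the helper column, so df is not changed;
# the helper column is dropped again once the sort is done.
result = (
    df.assign(_sort_key=df["Single"].str.upper())
      .sort_values(by="_sort_key")
      .drop(columns="_sort_key")
)
print(result)
```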

Upvotes: 6