Reputation: 12406
I have a Pandas dataframe in Python. The contents of the dataframe are from here. I slightly modified the case of the first letter of some entries in the "Single" column. Here is what I have:
import pandas as pd
df = pd.read_csv('test.csv')
print(df)
Position  Artist                       Single                  Year               Weeks
1         Frankie Laine                I Believe               1953               18 weeks
2         Bryan Adams                  I Do It for You         1991               16 weeks
3         Wet Wet Wet                  love Is All Around      1994               15 weeks
4         Drake (feat. Wizkid & Kyla)  One Dance               2016               15 weeks
5         Queen                        bohemian Rhapsody       1975/76 & 1991/92  14 weeks
6         Slim Whitman                 Rose Marie              1955               11 weeks
7         Whitney Houston              i Will Always Love You  1992               10 weeks
I would like to sort by the Single column in ascending order (a to z). When I run
df.sort_values(by='Single',inplace=True)
it seems that the sort is not able to combine upper and lowercase. Here is what I get:
Position  Artist                       Single                  Year               Weeks
1         Frankie Laine                I Believe               1953               18 weeks
2         Bryan Adams                  I Do It for You         1991               16 weeks
4         Drake (feat. Wizkid & Kyla)  One Dance               2016               15 weeks
6         Slim Whitman                 Rose Marie              1955               11 weeks
5         Queen                        bohemian Rhapsody       1975/76 & 1991/92  14 weeks
7         Whitney Houston              i Will Always Love You  1992               10 weeks
3         Wet Wet Wet                  love Is All Around      1994               15 weeks
So it sorts the uppercase entries first and then sorts the lowercase entries separately. I want a single combined sort that ignores the case of the first letter in the Single column. The row with "bohemian Rhapsody" is in the wrong place after sorting: it should be first, but it appears as the 5th row.
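This ordering is consistent with how Python compares strings by code point, where every uppercase ASCII letter comes before every lowercase one. A minimal sketch in plain Python, without pandas (the list `titles` is just a sample of the Single values above):

```python
# Sample of the "Single" values from the table above.
titles = ["I Believe", "bohemian Rhapsody", "love Is All Around", "One Dance"]

# Uppercase ASCII letters have smaller code points than lowercase ones.
upper_z, lower_a = ord("Z"), ord("a")

# Default comparison: all uppercase-initial titles come first.
default_order = sorted(titles)

# Comparing on a lowercased key gives the combined ordering.
folded_order = sorted(titles, key=str.lower)

print(default_order)
print(folded_order)
```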
Is there a way to sort a Pandas DataFrame while ignoring the case of the text in the Single column?
Upvotes: 36
Views: 17255
Reputation: 1506
Pandas 1.1.0 introduced the key argument as a more intuitive way to achieve this:
df.sort_values(by='Single', inplace=True, key=lambda col: col.str.lower())
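For anyone who wants to try this without the CSV, here is a small self-contained sketch. It assumes pandas 1.1.0 or later and rebuilds only the Position and Single columns from the question by hand. It also returns a new frame instead of using inplace=True, which is purely a choice for this example.

```python
import pandas as pd

# Hand-built subset of the question's data (Position and Single only).
df = pd.DataFrame({
    "Position": [1, 2, 3, 4, 5, 6, 7],
    "Single": [
        "I Believe",
        "I Do It for You",
        "love Is All Around",
        "One Dance",
        "bohemian Rhapsody",
        "Rose Marie",
        "i Will Always Love You",
    ],
})

# key receives the whole column as a Series and must return a Series
# of the same length; the sort compares the lowercased values.
sorted_df = df.sort_values(by="Single", key=lambda col: col.str.lower())

print(sorted_df)
```

The values stored in Single keep their original case. Only the comparison uses the lowercased copy.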
Upvotes: 43
Reputation: 1
Make a new lowercase column, sort by it, and delete it afterward.
df["Single.Lower"] = df["Single"].str.lower()
df.sort_values(['Single.Lower'], axis=0, ascending=True, inplace=True)
del df["Single.Lower"]
Upvotes: -1
Reputation: 215057
You can convert all strings to upper or lower case and then call argsort(), which returns the integer positions that would sort the values. Passing those positions to iloc reorders the data frame by Single, ignoring case:
df.iloc[df.Single.str.lower().argsort()]
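A rough sketch of the same idea, spelled out step by step on hand-built data (Position and Single only). The kind="mergesort" argument is an addition here, not part of the one-liner above. It asks for a stable sort, so titles that differ only by case keep their original relative order.

```python
import pandas as pd

# Hand-built subset of the question's data (Position and Single only).
df = pd.DataFrame({
    "Position": [1, 2, 3, 4, 5, 6, 7],
    "Single": [
        "I Believe",
        "I Do It for You",
        "love Is All Around",
        "One Dance",
        "bohemian Rhapsody",
        "Rose Marie",
        "i Will Always Love You",
    ],
})

# Integer positions that would sort the lowercased titles.
order = df["Single"].str.lower().argsort(kind="mergesort")

# iloc selects rows by position, so the rows come back in sorted order.
result = df.iloc[order.to_numpy()]

print(result)
```

This works on pandas versions older than 1.1.0, where sort_values has no key argument.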
Upvotes: 24
Reputation: 57085
Create a copy of Single in all uppercase letters and sort by that column:
df["Single.Upper"] = df["Single"].str.upper()
df.sort_values(by="Single.Upper", inplace=True)
You can delete the column later:
del df["Single.Upper"]
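If you would rather not modify df at all, the same helper-column idea can be written as one chained expression. This is only a sketch: the column name `_sort_key` is made up for the example, and the data is a hand-built subset of the question's table.

```python
import pandas as pd

# Hand-built subset of the question's data (Position and Single only).
df = pd.DataFrame({
    "Position": [1, 2, 3, 4, 5, 6, 7],
    "Single": [
        "I Believe",
        "I Do It for You",
        "love Is All Around",
        "One Dance",
        "bohemian Rhapsody",
        "Rose Marie",
        "i Will Always Love You",
    ],
})

# assign() returns a copy with the temporary uppercase column added,
# so the original df never gains or loses a column.
sorted_df = (
    df.assign(_sort_key=df["Single"].str.upper())
      .sort_values(by="_sort_key")
      .drop(columns="_sort_key")
)

print(sorted_df)
```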
Upvotes: 6