Reputation:
This is what my DataFrame looks like:
    StateAb    GivenNm    Surname                  PartyNm  PartyAb  ElectedOrder
35       WA        Joe    BULLOCK   Australian Labor Party      ALP             2
36       WA  Michaelia       CASH                  Liberal       LP             3
37       WA      Linda   REYNOLDS                  Liberal       LP             4
38       WA      Wayne  DROPULICH  Australian Sports Party     SPRT             5
39       WA      Scott     LUDLAM          The Greens (WA)      GRN             6
I want to get a list of senators whose surname is more than 9 characters long.
I thought the code should look like this:
df[len(df.Surname) > 9]
but this raises a KeyError. Where did I go wrong?
Upvotes: 10
Views: 52587
Reputation:
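The correct way to filter a DataFrame based on the length of the strings in a column is
df[df['Surname'].str.len() > 9]
Here df['Surname'].str.len() creates a Series holding the length of each surname, and df[df['Surname'].str.len() > 9] keeps only the rows where that length is greater than 9. What you did was check the length of the Series itself (how many rows it has). That gives a single integer, so len(df.Surname) > 9 evaluates to one True or False, and df[True] is then treated as a lookup for a column named True, which is why you get a KeyError.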
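To make the difference concrete, here is a minimal sketch, assuming pandas is installed. The sample rows are illustrative only: two surnames come from the question, and "LONGSURNAME" is a made-up value added so that at least one row passes the filter.

```python
import pandas as pd

# Illustrative data: "LONGSURNAME" is a hypothetical value, not from the question.
df = pd.DataFrame({
    "GivenNm": ["Joe", "Wayne", "Alex"],
    "Surname": ["BULLOCK", "DROPULICH", "LONGSURNAME"],
})

# len() on a Series counts its rows, so this is one plain bool, not a per-row mask.
scalar_check = len(df["Surname"]) > 9

# .str.len() measures each string and returns one integer per row.
lengths = df["Surname"].str.len()

# The element-wise comparison gives a boolean mask that selects matching rows.
long_surnames = df[lengths > 9]
print(long_surnames)
```

Note that DROPULICH is exactly 9 characters long, so it is excluded by the strict > 9 comparison. Use >= 9 if such names should be kept.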
Upvotes: 22
Reputation: 11
Have a look at Python's built-in filter function. It does what you want when the records are held in a plain list of dicts rather than a pandas DataFrame:
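# A plain list of dicts standing in for the data (not a pandas DataFrame)
df = [
    {"Surname": "Bullock-ish"},
    {"Surname": "Cash"},
    {"Surname": "Reynolds"},
]
# Keep only the records whose surname is longer than 9 characters
longnames = list(filter(lambda s: len(s["Surname"]) > 9, df))
print(longnames)
# prints [{'Surname': 'Bullock-ish'}]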
Sytse
Upvotes: 1