Reputation: 209
I have a dataframe which contains a column with strings of the form XXX/XX/XXX. I want to remove all rows for which the length of the string between the '/'s is not equal to two.
I'm getting "KeyError: True" with the following code:
df_issues = df_new[len(df_new['Job'].str.split('/')[1]) != 2 ]
My approach was to create a series with all rows for which the string length after the first '/' was not equal to 2.
Thanks for any help.
Upvotes: 2
Views: 763
Reputation: 88236
Two things are wrong here:
First, len(...) != 2 evaluates to a single boolean, so you end up indexing with df_new[True]. That raises a KeyError because pandas looks for a column labelled True. To filter rows you need a boolean array with one entry per row, something like df_new[[True, False, True, ...]].
Second, after str.split you need the str accessor again to index into the list in each row. A plain [1] selects the row with index label 1 instead of the second piece of every list.
Use instead:
df_new[df_new['Job'].str.split(r'/').str[1].str.len().eq(2.)]
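To see what that one-liner does step by step, here is a minimal sketch on made-up data. Only the column name 'Job' comes from the question; the sample values and the names middle_len, mask and df_ok are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'Job' comes from the question.
df_new = pd.DataFrame(
    {"Job": ["ABC/12/XYZ", "ABC/123/XYZ", "DEF/4/UVW", "GHI/56/RST", None]}
)

# Split each string on '/', take the second piece of every row, and measure it.
middle_len = df_new["Job"].str.split("/").str[1].str.len()

# Element-wise comparison: one boolean per row. Rows with a missing 'Job'
# have no length to compare and therefore do not pass the check.
mask = middle_len.eq(2)

df_ok = df_new[mask]        # rows whose middle segment has exactly two characters
df_issues = df_new[~mask]   # the remaining rows

print(df_ok)
print(df_issues)
```

Because mask holds one value per row, df_new[mask] filters rows instead of looking up a column, which is what avoids the KeyError.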
Or we could also use str.contains:
# corrected with @jon's remarks
df_new[df_new['Job'].str.contains(r'^.{3}/.{2}/.{3}$',na=False)]
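Note that this pattern checks all three segment lengths, not just the middle one. A rough sketch comparing it with a looser variant that only constrains the middle segment, as the question asks. The second pattern and the sample data are my own assumptions, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'Job' comes from the question.
df_new = pd.DataFrame(
    {"Job": ["ABC/12/XYZ", "AB/12/WXYZ", "ABC/123/XYZ", None]}
)

# Pattern from the answer: exactly 3, 2 and 3 characters around the slashes.
strict = df_new["Job"].str.contains(r"^.{3}/.{2}/.{3}$", na=False)

# Looser variant (an assumption, not from the answer): only the middle
# segment is constrained, and no segment may itself contain a '/'.
middle_only = df_new["Job"].str.contains(r"^[^/]+/[^/]{2}/[^/]+$", na=False)

print(df_new[strict])
print(df_new[middle_only])
```

In both cases na=False makes missing values count as non-matching, so the result is a clean boolean mask.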
Upvotes: 3