Reputation: 439
I am having trouble understanding the mechanics here, given the following.
I have a dataframe read from a .csv:
a1 b1 c1
1 aa bb cc
2 ab ba ca
df.drop(df['a1'].str.contains('aa',case = False))
I want to drop all the rows where column a1 contains 'aa'.
I believe I have tried everything on here, but I still get:
ValueError: labels [False False False ... False False False] not contained in axis
Yes, I have also tried skipinitialspace=True and axis=1.
Any help would be appreciated, thank you.
Upvotes: 2
Views: 120
Reputation: 403128
str.contains returns a boolean mask:
df['a1'].str.contains('aa',case = False)
1 True
2 False
Name: a1, dtype: bool
However, drop accepts index labels, not boolean masks. If you open up the help on drop, you can see this first-hand:
?df.drop
Signature: df.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
Docstring:
Return new object with labels in requested axis removed.

Parameters
----------
labels : single label or list-like
    Index or column labels to drop.
You could work out the index labels from the mask and pass those to drop:
idx = df.index[df['a1'].str.contains('aa')]
df.drop(idx)
a1 b1 c1
2 ab ba ca
However, this is long-winded, so I'd recommend sticking to the pandaic way of dropping rows based on a condition, which is boolean indexing:
df[~df['a1'].str.contains('aa')]
a1 b1 c1
2 ab ba ca
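To make the two approaches easy to compare, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the sample in the question rather than read from the .csv, and passing na=False is my own addition (an assumption that missing values should count as "no match"), not something the question asked for.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (the real data comes from a .csv).
df = pd.DataFrame(
    {"a1": ["aa", "ab"], "b1": ["bb", "ba"], "c1": ["cc", "ca"]},
    index=[1, 2],
)

# Boolean mask: True where a1 contains 'aa'.
# na=False (an assumption) treats missing values as non-matches,
# so the mask stays strictly boolean and can be inverted with ~.
mask = df["a1"].str.contains("aa", case=False, na=False)

# Option 1: turn the mask into index labels, then drop those labels.
dropped = df.drop(df.index[mask])

# Option 2: boolean indexing, keeping the rows where the mask is False.
filtered = df[~mask]

# Both routes should leave the same rows.
assert dropped.equals(filtered)
print(filtered)
```

Neither call modifies df in place, so assign the result back (df = df[~mask]) if you want to keep it.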
If anyone is interested in removing rows that contain any of the strings in a list:
df = df[~df['a1'].str.contains('|'.join(my_list))]
Make sure to strip whitespace from the strings first. Credit to https://stackoverflow.com/a/45681254/9500464
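One thing to watch with the list version: str.contains treats the joined string as a regular expression by default, so entries containing characters such as '.' can match more than intended. A rough sketch of a safer variant follows, assuming the list entries are meant as literal text. The sample data and the contents of my_list are made up for illustration.

```python
import re

import pandas as pd

# Hypothetical data and search list, for illustration only.
df = pd.DataFrame({"a1": ["aa", "ab", "c.d", "cxd"]})
my_list = ["aa ", "c.d"]  # note the stray trailing space in the first entry

# Strip whitespace from each entry, then escape regex metacharacters
# so that 'c.d' only matches a literal dot and not 'cxd'.
pattern = "|".join(re.escape(s.strip()) for s in my_list)

# Keep the rows whose a1 value matches none of the entries.
result = df[~df["a1"].str.contains(pattern, na=False)]
print(result)
```

Without re.escape, the 'c.d' entry would also remove the 'cxd' row, because '.' matches any character in a regex.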
Upvotes: 6