Reputation: 31
I want to delete rows that have certain values. The values that I want to delete contain a "+" and are as follows:
cooperative+parallel
passive+prosocial
My dataset consists of 900,000 rows, and about 2,000 of them contain the values I mentioned.
I want code something like this:
df = df[df.columnname != '+']
The above is for one column (it's not working well), but I would also like an example for the whole dataset.
I prefer the solution in Pandas.
Many thanks
Upvotes: 1
Views: 331
Reputation: 862601
Use Series.str.contains with the mask inverted by ~, and escape the + because it is a special regex character. Apply it with DataFrame.apply to all object columns selected by DataFrame.select_dtypes, then use DataFrame.any to test for at least one match per row:
df1 = df[~df.select_dtypes(object).apply(lambda x: x.str.contains(r'\+')).any(axis=1)]
Or use regex=False, which treats + as a literal character, so it must not be escaped:
df1 = df[~df.select_dtypes(object).apply(lambda x: x.str.contains('+', regex=False)).any(axis=1)]
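As a rough illustration of the whole-dataset approach, here is a minimal sketch on a made-up frame. The column names style, label and score and their values are hypothetical, and the string columns are built with object dtype explicitly so that select_dtypes(object) picks them up:

```python
import pandas as pd

# Hypothetical sample data; string columns are forced to object dtype
df = pd.DataFrame({
    "style": pd.Series(
        ["cooperative+parallel", "solitary", "passive+prosocial", "parallel"],
        dtype=object,
    ),
    "label": pd.Series(["a", "b", "c", "d+e"], dtype=object),
    "score": [1, 2, 3, 4],
})

# True for every row where at least one object column contains a literal "+"
has_plus = (
    df.select_dtypes(object)
      .apply(lambda col: col.str.contains("+", regex=False))
      .any(axis=1)
)

# Keep only the rows without a "+" in any object column
df1 = df[~has_plus]
print(df1)
```

Building the mask as a separate variable makes it easy to check how many rows will be dropped (has_plus.sum()) before filtering.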
Upvotes: 3
Reputation: 976
df = df[~df['columnname'].str.contains('+', regex=False)]
The documentation is here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html
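One caveat, shown here as a small sketch on hypothetical data: if the column has missing values, str.contains returns NaN for them in an object column, and inverting that mask with ~ fails. Passing na=False (a documented parameter of Series.str.contains) treats missing values as non-matches, so those rows are kept:

```python
import pandas as pd

# Hypothetical single-column example with a missing value
df = pd.DataFrame({
    "columnname": pd.Series(
        ["cooperative+parallel", None, "prosocial"], dtype=object
    ),
})

# na=False makes missing values count as "does not contain +"
mask = df["columnname"].str.contains("+", regex=False, na=False)

# Drop the rows whose value contains "+"
cleaned = df[~mask]
print(cleaned)
```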
Upvotes: 0