Precious11

Reputation: 31

Delete rows with a certain value in Python and Pandas

I want to delete rows that contain certain values. The values I want to remove contain a "+", for example:

cooperative+parallel
passive+prosocial

My dataset has 900000 rows, and about 2000 of them contain values like these.

I'd like code along these lines:

df = df[df.columnname != '+']

The above is for one column (it's not working well), but I would also like an example that covers the whole dataset.

I prefer the solution in Pandas.

Many thanks

Upvotes: 1

Views: 331

Answers (2)

jezrael

Reputation: 862601

Use Series.str.contains and invert the resulting mask with ~. Escape the +, because it is a special regex character. To cover the whole dataset, select all object columns with DataFrame.select_dtypes, run the test on each of them with DataFrame.apply, and use DataFrame.any(axis=1) to flag rows with at least one match:

df1 = df[~df.select_dtypes(object).apply(lambda x: x.str.contains(r'\+')).any(axis=1)]

Or use regex=False, in which case the + is matched literally and must not be escaped:

df1 = df[~df.select_dtypes(object).apply(lambda x: x.str.contains('+', regex=False)).any(axis=1)]
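
For illustration, here is a minimal end-to-end sketch of the whole-dataset filter on a small made-up frame. The column names and values are hypothetical, and na=False is an addition of this sketch (not part of the lines above) so that missing cells count as "no match".

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
# dtype=object is set explicitly so select_dtypes(object) picks these columns up.
df = pd.DataFrame({
    "style": pd.Series(["cooperative+parallel", "cooperative", "passive", "parallel"], dtype=object),
    "role": pd.Series(["leader", "passive+prosocial", "follower", None], dtype=object),
    "score": [1, 2, 3, 4],
})

# Keep only the object (text) columns; numeric columns cannot contain "+".
text_cols = df.select_dtypes(object)

# Boolean frame: True where a cell contains a literal "+".
# regex=False disables regex parsing; na=False treats missing cells as no match.
has_plus = text_cols.apply(lambda col: col.str.contains("+", regex=False, na=False))

# Drop every row where at least one text column matched.
df1 = df[~has_plus.any(axis=1)]
print(df1)
```

Only the rows without a "+" in any text column should remain, here the last two.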

Upvotes: 3

Artyom Akselrod

Reputation: 976

df = df[~df['columnname'].str.contains('+', regex=False)]

The documentation for Series.str.contains is here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html
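
As a rough sketch of why the != '+' attempt from the question does not help: it compares each whole cell to the string "+", while str.contains looks for "+" anywhere inside the cell. The sample values below are made up for illustration.

```python
import pandas as pd

# Hypothetical single-column frame; "columnname" mirrors the placeholder used in the question.
df = pd.DataFrame({
    "columnname": pd.Series(["cooperative+parallel", "passive", "+"], dtype=object),
})

# Whole-cell comparison: only drops cells that are exactly "+".
exact = df[df["columnname"] != "+"]

# Substring test: drops every cell that contains "+" anywhere.
substring = df[~df["columnname"].str.contains("+", regex=False)]

print(exact)
print(substring)
```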

Upvotes: 0
