Filter out all rows in a dataframe containing '**'

Question

I am trying to filter out all rows in a DataFrame that contain the substring '**'.

I have tried doing this with

df = df[~df['title'].str.contains('**')]

However, I keep getting this error:

error: nothing to repeat at position 0

and can't figure out why.

sacuL · Accepted Answer

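str.contains treats its pattern as a regular expression by default, and * is a regex quantifier (zero or more of the preceding element). In '**' the first * has nothing before it to repeat, hence "nothing to repeat at position 0". Escape each * with a backslash, and use a raw string so Python passes the backslashes through to the regex engine unchanged. In your case:

df[~df['title'].str.contains(r'\*\*')]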
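
To see where the error comes from without pandas involved, here is a minimal sketch using only Python's standard re module (which, as far as I know, is what str.contains relies on when regex matching is enabled). The variable names are just for illustration.

```python
import re

# An unescaped '*' at the start of a pattern has nothing before it to repeat,
# so compiling the pattern fails.
try:
    re.compile("**")
    message = None
except re.error as exc:
    message = str(exc)

print(message)

# Escaped, the pattern matches two literal asterisks in a row.
pattern = re.compile(r"\*\*")
found = pattern.search("x**yz") is not None   # contains '**'
not_found = pattern.search("x*") is None      # only a single '*'
```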

Example:

>>> df
   title
0    xyz
1  x**yz
2     **
3     x*

>>> df[~df['title'].str.contains(r'\*\*')]

  title
0   xyz
3    x*
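
If you would rather not hand-escape the pattern, two alternatives may be worth a look: passing regex=False so the pattern is treated as a plain substring, or building the pattern with re.escape. The sketch below rebuilds the example frame from above and assumes the title column holds only strings (no missing values); the result names are illustrative.

```python
import re

import pandas as pd

# Same data as the example above.
df = pd.DataFrame({"title": ["xyz", "x**yz", "**", "x*"]})

# Option 1: escape the asterisks by hand in a raw string.
escaped = df[~df["title"].str.contains(r"\*\*")]

# Option 2: turn off regex matching so '**' is a literal substring.
literal = df[~df["title"].str.contains("**", regex=False)]

# Option 3: let re.escape add the backslashes for you.
auto = df[~df["title"].str.contains(re.escape("**"))]

print(literal)
```

All three should keep only the rows whose title does not contain two consecutive asterisks. regex=False is probably the simplest choice when the search text is never meant to be a pattern.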
