asimo

Reputation: 2500

Filter a dataframe on partial string matches that contain special characters

I am trying to filter a dataframe on partial string matches: a row should be kept when the value in its column contains any of the values I have in a list.

The issue is that some of the strings to match contain special characters, e.g.:

=OEAKPOB|2OEAQPYA0402343|@@EAY632|@@EAY6XF3260| LD93684589|4+EB484K|4+EB481W|4*EBEWRX||=OEAKQJW|VNEAKX74

When I try

pat = '|'.join(criteria_filter['ID'])
df_B = detfile_df[detfile_df['ID'].str.contains(pat)]

I get:

error: nothing to repeat

I guess this is due either to a bug or to my two-line code above being unable to deal with special characters.

How can I fix this?

Upvotes: 1

Views: 25

Answers (1)

jezrael

Reputation: 862911

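str.contains treats the pattern as a regular expression by default, so characters such as + and * in your IDs are read as quantifiers. "nothing to repeat" means a quantifier has nothing before it to apply to. You can escape the special regex characters with re.escape in a generator expression before joining: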

import re
pat = '|'.join(re.escape(x) for x in criteria_filter['ID'])
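For completeness, here is a minimal end-to-end sketch of the escaped pattern applied to the filter. The two small frames are made-up stand-ins for criteria_filter and detfile_df (their contents are assumptions, not your data), and na=False is an optional extra that treats missing IDs as non-matches:

```python
import re
import pandas as pd

# Hypothetical sample data standing in for the question's frames.
criteria_filter = pd.DataFrame({"ID": ["4+EB484K", "4*EBEWRX", "=OEAKPOB"]})
detfile_df = pd.DataFrame(
    {"ID": ["xx4+EB484Kyy", "4EBEWRX", "=OEAKPOB-1", "VNEAKX74"]}
)

# Escape each ID so that +, *, etc. are matched literally, then join
# the escaped IDs into one alternation pattern.
pat = "|".join(re.escape(x) for x in criteria_filter["ID"])

# Keep rows whose ID contains any of the escaped values as a substring.
# na=False makes rows with a missing ID count as "no match".
mask = detfile_df["ID"].str.contains(pat, regex=True, na=False)
df_B = detfile_df[mask]

print(df_B["ID"].tolist())  # ['xx4+EB484Kyy', '=OEAKPOB-1']
```

Note that "4EBEWRX" is not kept: once escaped, 4*EBEWRX only matches text containing a literal asterisk. Without escaping, 4* would mean "zero or more 4s" and that row would match by accident.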

Upvotes: 1
