str.contains to match entire string - Python

Question

I am trying to check whether a certain list includes elements of another list.

I am using the following line of code:

check = df_1['website'].str.contains(df_2['website'].tolist()[i])

The problem I am facing is that I receive false positives if the first df partially includes the strings in the second one.

For example, I am looking to find whether the following string in df_2['website'] is contained in df_1['website']:

sample_text_to_check

Since df_1['website'] contains the following string:

text_to_check

It results in a positive match. I would like to check for exact matches only (i.e. the entire string is matched, not only some letters within it).

How can I do that? The list is 200k lines long and contains many different strings.

Tim Biegeleisen · Accepted Answer

You could just place ^ and $ boundary markers around the string:

check = df_1['website'].str.contains(r'^' + df_2['website'].tolist()[i] + r'$')
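
One caveat with the anchored pattern: website strings usually contain characters such as "." that have a special meaning in a regex, so it is probably safer to pass the value through re.escape first. The sketch below uses made-up sample data (the contents of df_1 and df_2 and the variable names are assumptions, not taken from the question) to compare the anchored regex with three alternatives that need no regex or avoid the loop: str.fullmatch, plain equality, and isin. With 200k rows, isin is likely the cheapest way to test every value at once.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for the question's df_1 and df_2.
df_1 = pd.DataFrame({"website": ["text_to_check", "example.com", "exampleXcom"]})
df_2 = pd.DataFrame({"website": ["sample_text_to_check", "example.com"]})

needle = df_2["website"].tolist()[1]  # "example.com"

# Anchored regex, escaped so that "." is matched literally.
# Without re.escape, "exampleXcom" would also match.
anchored = df_1["website"].str.contains(r"^" + re.escape(needle) + r"$")

# str.fullmatch requires the whole string to match, so no anchors are needed.
full = df_1["website"].str.fullmatch(re.escape(needle))

# For a literal whole-string comparison, plain equality avoids regex entirely.
equal = df_1["website"] == needle

# To check every df_1 value against all df_2 values at once, without a loop.
in_df_2 = df_1["website"].isin(df_2["website"])

print(anchored.tolist())
print(in_df_2.tolist())
```

Each of these returns a boolean Series aligned with df_1, so it can be used directly as a mask, e.g. df_1[in_df_2].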
