Vinay billa

Reputation: 309

How to filter a pandas dataframe using multiple partial strings?

I understand how to filter a pandas DataFrame using one or two partial strings:

final_df = df[df['Answers'].str.contains("not in","not on")]

I got help from this question: Select by partial string from a pandas DataFrame

However, I am unable to extend the filtering to more than two partial strings:

final_df = df[df['Answers'].str.contains("not in","not on","not have")]

When I try this, I get the following error:

TypeError: unsupported operand type(s) for &: 'str' and 'int'

How should I change this to filter on more than two partial strings? Thank you.

Upvotes: 0

Views: 419

Answers (1)

Space Impact

Reputation: 13255

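str.contains takes a single pattern; any further positional arguments are interpreted as its case and flags parameters, not as additional search strings. To match any of several partial strings, pass one regex pattern in which the search strings are separated by | (regex alternation):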

mask = df['Answers'].str.contains(regex_pattern)
final_df = df[mask]

To build the regex pattern from a list of search strings, join them with |:

strings_to_find = ["not in","not on","not have"]
regex_pattern = '|'.join(strings_to_find)
# regex_pattern is now 'not in|not on|not have'
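
For completeness, here is a minimal end-to-end sketch of the approach above. The sample rows are hypothetical; only the column name Answers and the three search strings come from the question. Two additions go beyond the original answer and are assumptions about what you may need: re.escape, in case a search string contains regex metacharacters such as . or (, and na=False, in case the column has missing values.

```python
import re

import pandas as pd

# Hypothetical sample data; only the column name 'Answers' comes from the question.
df = pd.DataFrame({
    "Answers": [
        "it is not in the list",
        "the light is not on",
        "we do not have any",
        "everything is fine",
        None,
    ]
})

strings_to_find = ["not in", "not on", "not have"]

# Escape each search string so regex metacharacters are matched literally,
# then join the alternatives with '|' (regex "or").
regex_pattern = "|".join(re.escape(s) for s in strings_to_find)

# na=False treats missing values as non-matches, so the mask can be used for indexing.
mask = df["Answers"].str.contains(regex_pattern, na=False)
final_df = df[mask]

print(final_df)
```

With this sample data, final_df keeps the first three rows and drops both the non-matching row and the missing value. If the search strings are meant to be regular expressions themselves, drop the re.escape call.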

Upvotes: 4
