YasserKhalil

Reputation: 9538

Filtering rows in pandas dataframe in python

I am trying to filter rows in a DataFrame by multiple strings. I searched and found this:

search_values = ['vba','google']
df[df[0].str.contains('|'.join(search_values), case=False)]

But I think this matches rows that contain either of the two strings, vba or google. How can I combine the two strings with AND instead of OR? I mean that a row should be selected only if both strings are present in the column. Take this sentence for example: "I mean vba will be with google in one sentence". This row would be selected because it contains both vba and google.

Upvotes: 0

Views: 76

Answers (1)

Daniel Geffen

Reputation: 1862

The str.contains method treats the pattern as a regular expression by default, and | means OR in regex. If you want rows that contain both terms, this should work:

search_values = ['vba','google']
df[df[0].str.contains(f'{search_values[0]}.*{search_values[1]}|{search_values[1]}.*{search_values[0]}', case=False)]

The .* part matches any run of characters (other than a newline) between the two search terms. So the pattern matches rows with 'vba', then anything, then 'google', or rows with 'google', then anything, then 'vba'.
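The alternation above needs one branch per ordering, so it gets long once there are more than two terms. A possible alternative, sketched here under a few assumptions, is to build one boolean mask per term and AND them together. The sample DataFrame is made up for illustration, and regex=False is an added choice that makes each term a literal substring rather than a regex pattern.

```python
from functools import reduce
import operator

import pandas as pd

# Hypothetical sample data; column 0 mirrors the df[0] used above.
df = pd.DataFrame({0: [
    "I mean vba will be with google in one sentence",
    "only VBA here",
    "Google alone",
    "google sheets and VBA macros",
]})

search_values = ['vba', 'google']

# One boolean mask per term. regex=False treats the term as a plain
# substring, and case=False makes the match case-insensitive.
masks = [
    df[0].str.contains(term, case=False, regex=False)
    for term in search_values
]

# AND all masks together: a row is kept only if every term is present.
has_all_terms = reduce(operator.and_, masks)

result = df[has_all_terms]
print(result)
```

This works for any number of terms and in any order. If the column can contain missing values, passing na=False to str.contains keeps the masks strictly boolean.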

Upvotes: 1
