Reputation: 9538
I am trying to filter the rows of a DataFrame by multiple strings. After searching, I found this:
search_values = ['vba','google']
df[df[0].str.contains('|'.join(search_values), case=False)]
But I think this matches rows that contain either of the two strings, vba or google. How can I combine the two strings with AND instead of OR? I mean that a row should be selected only if both strings appear in the column. For example, the sentence "I mean vba will be with google in one sentence" would be selected because it contains both vba and google.
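To make the problem concrete, here is a minimal sketch with made-up rows (the sample sentences are assumptions for illustration, not my real data). It shows the OR pattern keeping rows that contain only one of the two terms:

```python
import pandas as pd

# Hypothetical sample data; column 0 holds the text, as in the snippet above.
df = pd.DataFrame({0: [
    "I mean vba will be with google in one sentence",
    "only vba here",
    "only Google here",
    "neither term",
]})

search_values = ['vba', 'google']

# '|' is regex alternation, so a row matches if it contains EITHER term.
or_mask = df[0].str.contains('|'.join(search_values), case=False)
or_rows = df[or_mask]
print(or_rows)
```

Only the first row is the one I actually want, but the second and third rows are returned as well.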
Upvotes: 0
Views: 76
Reputation: 1862
The contains function treats the pattern as a regex by default, and | means OR in regex. If you want rows that contain both strings, this should work:
search_values = ['vba','google']
df[df[0].str.contains(f'{search_values[0]}.*{search_values[1]}|{search_values[1]}.*{search_values[0]}', case=False)]
The .* part matches any run of characters between the search terms (by default it does not match across line breaks). So the pattern finds rows with 'vba', then any text, then 'google', or rows with 'google', then any text, then 'vba'.
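The two-term pattern above needs one alternative per ordering, so it grows quickly with more search terms. A different technique that avoids this is to run one contains check per term and AND the boolean masks together. The following is only a sketch under some assumptions: contains_all is a hypothetical helper name, the sample rows are made up, and re.escape is added so that terms containing regex special characters are matched literally.

```python
import re

import pandas as pd


def contains_all(series, terms, case=False):
    """Return a boolean mask that is True where the text contains every term."""
    # Start from an all-True mask and AND in one contains() check per term.
    mask = pd.Series(True, index=series.index)
    for term in terms:
        # re.escape makes the term literal; na=False treats missing text as no match.
        mask = mask & series.str.contains(re.escape(term), case=case, na=False)
    return mask


# Hypothetical sample data for illustration.
df = pd.DataFrame({0: [
    "I mean vba will be with google in one sentence",
    "Google first, then VBA",
    "only vba here",
    "neither term",
]})

search_values = ['vba', 'google']
result = df[contains_all(df[0], search_values)]
print(result)
```

Because each term is checked separately, the order of the terms in the text does not matter, and adding a third or fourth term only means adding it to search_values.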
Upvotes: 1