Reputation: 2253
I have the following list:
new_pets = ['Bobcats', 'dog', 'cat', 'turtle', 'monkey', 'goat', 'ferret', 'pig', 'Chipmunks', 'Capybaras', 'Ducks']
And the following pandas dataframe:
In: df
0 Cats
1 Lizard
2 Dog
3 Baby Ferrets
4 Pig
5 Armadillo
How can I get into a new column the elements of new_pets
which appear in df
(*)?:
In: df['new_col']
0 True
1 False
2 True
3 True
4 True
5 False
From the docs, I noticed that this can be done with contains(), so I tried the following:
result = df[df['pets'].str.contains(x, case = False) for x in new_pets]
However, I am not sure if this is possible. For instance, is it possible to match Baby Ferrets
with ferret
, since Ferrets
is similar to ferret
?. For that constraint, I tried to use case=False
, but I am not getting the expected outcome (*). Any idea of how to retrive such strings in a new dataframe?.
Upvotes: 1
Views: 46
Reputation: 863431
You can first join values by |
(regex or is |
) and convert all values to lower-case by lower
- output is in joined
. Then lowercase all values in column by str.lower
and call str.contains
with joined
for checking if bobcats
or dog
or dog
... is in column:
print (df)
pets
0 Cats
1 Lizard
2 Dog
3 Baby Ferrets
4 Pig
5 Armadillo
joined = '|'.join(new_pets).lower()
df['new_col'] = df['pets'].str.lower().str.contains(joined)
print (df)
a new_col
0 Cats True
1 Lizard False
2 Dog True
3 Baby Ferrets True
4 Pig True
5 Armadillo False
Upvotes: 2