Reputation: 395
I have a list and a dataframe with one column named Description that looks like this:
my_list = ['dog','cat','bird'...]
df
| Description |
|three_legged_dog0named1_Charlie|
| catis_mean |
| 1hippo_stepped-on_an_ant |
I want to write a for loop that goes through each row in df and checks whether it contains an element of my_list. If it does, print that element.
Normally I'd use search(), but I don't know how it works with a list. I could write a for loop that handles every case separately, but I don't want to do that. Is there another way?
for i in df['Description']:
    if i is in my_list:  # pseudocode: i contains an element of my_list
        print('the element that is in i')
    else:
        print('not in list')
The output should be:
dog
cat
not in list
Upvotes: 1
Views: 1044
Reputation: 863226
If you want a non-loop pandas approach, use Series.str.findall to collect every match in each row, Series.str.join to join the matched values with a comma, and finally Series.replace to turn the empty strings left by rows with no match into 'not in list':
my_list = ['dog','cat','bird']
df['new'] = (df['Description'].str.findall('|'.join(my_list))
                              .str.join(',')
                              .replace('', 'not in list'))
print(df)
                       Description          new
0  three_legged_dog0named1_Charlie          dog
1                       catis_mean          cat
2         1hippo_stepped-on_an_ant  not in list
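For comparison, the explicit loop the question asks for can be sketched with only the standard library's re module. This is a minimal sketch and not part of the pandas solution above: the helper name first_match is hypothetical, the rows are copied from the question into a plain list, and re.escape is added on the assumption that the list elements should be matched literally rather than as regex syntax.

```python
import re

my_list = ['dog', 'cat', 'bird']
descriptions = [
    'three_legged_dog0named1_Charlie',
    'catis_mean',
    '1hippo_stepped-on_an_ant',
]

# Escape each element so it is matched literally, then OR them together.
pattern = re.compile('|'.join(map(re.escape, my_list)))

def first_match(text):
    """Return the first list element found in text, or 'not in list'."""
    found = pattern.search(text)
    return found.group(0) if found else 'not in list'

results = [first_match(d) for d in descriptions]
for r in results:
    print(r)
```

With a DataFrame, the same helper could probably be applied per row with df['Description'].map(first_match), though the vectorized str methods above are usually the more idiomatic choice.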
Upvotes: 1
Reputation: 294488
pd.Series.str.replace
pattern = f'^.*({"|".join(my_list)}).*$'
# Create a mask to rid ourselves of the pesky no matches later
mask = df.Description.str.match(pattern)
# where the magic happens, use `r'\1'` to swap in the thing that matched
df.Description.str.replace(pattern, r'\1', regex=True).where(mask, 'not in list')
0            dog
1            cat
2    not in list
Name: Description, dtype: object
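One detail worth knowing about a pattern of this shape: the leading .* is greedy, so when a row contains more than one list element, the capture group ends up holding the last one. The sketch below illustrates this with the standard library's re.sub, which is assumed here to behave like str.replace with regex=True. The sample string and the lazy variant are made up for illustration and do not come from the answer.

```python
import re

my_list = ['dog', 'cat', 'bird']
alternation = '|'.join(my_list)

greedy = f'^.*({alternation}).*$'  # same shape as the pattern above
lazy = f'^.*?({alternation}).*$'   # hypothetical variant with a lazy prefix

text = 'dog_chases_cat'  # made-up row containing two list elements

greedy_result = re.sub(greedy, r'\1', text)
lazy_result = re.sub(lazy, r'\1', text)

print(greedy_result)  # cat (the last element in the row)
print(lazy_result)    # dog (the first element in the row)
```

For the three rows in the question both variants give the same result, since no row contains more than one element.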
Upvotes: 1