Reputation: 3299
The objective is to find whether each cell in a column contain either one of the string from a set of strings.
Given a df below;
df = pd.DataFrame({'Date': ['1 December 2020', 'May 2020', 'Random value', 'may 2019']})
and a flag sub string of
stk_list = ['December','May']
It should return
True; True; False; True
However, using the code below
stk_list = ['December','May']
df = pd.DataFrame({'Date': ['1 December 2020', 'May 2020', 'Random value', 'may 2019']})
dfxx=df['Date'].isin(stk_list)
it return
False; False;False;False
May I know where did I do wrong?
Upvotes: 1
Views: 41
Reputation: 4658
You can use apply
for this:
dfxx=df['Date'].apply(lambda x: any([i in x for i in stk_list])
Upvotes: 1
Reputation: 150735
You can use regex
for this purpose:
pattern = '|'.join(s.lower() for s in stk_list)
df.Date.str.lower().str.contains(pattern)
You may want to modify pattern
so that it only checks for the whole words:
pattern = r"\b({})\b".format('|'.join(s.lower() for s in stk_list))
Here \b
denotes word boudary, so this pattern would not flag maybe
as True
.
Upvotes: 1