Reputation: 139
I have a DataFrame:
import pandas as pd

d = {'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'text': ['bill did this', 'jim did something', 'phil', 'bill did nothing',
              'carl was here', 'this is random', 'other name',
              'other bill', 'bill and carl', 'last one']}
df = pd.DataFrame(data=d)
I would like to check, for each row, whether the text column contains any value from a list, where the list is:
list = ['bill','carl']
I'd like to get something like this:
id  text               contains
1   bill did this      bill
2   jim did something
3   phil
4   bill did nothing   bill
5   carl was here      carl
6   this is random
7   other name
8   other bill         bill
9   bill and carl      bill
9   bill and carl      carl
10  last one
The way two or more names in the same row are handled is open to change, though. Any suggestions?
Upvotes: 2
Views: 3745
Reputation: 5434
You can use apply with a lambda function that checks each item in your list against the text:
import pandas as pd

d = {'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'text': ['bill did this', 'jim did something', 'phil', 'bill did nothing',
              'carl was here', 'this is random', 'other name',
              'other bill', 'bill and carl', 'last one']}
df = pd.DataFrame(data=d)
l = ['bill','carl']
df['contains'] = df['text'].apply(lambda x: ','.join([i for i in l if i in x]))
The join concatenates all matching values into one comma-separated string. Remove it if you would rather keep the matches as a list.
Output
>>> df['contains']
0         bill
1
2
3         bill
4         carl
5
6
7         bill
8    bill,carl
9
Name: contains, dtype: object
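If you would rather have one row per matched name, as in the table in the question, one option is to keep the matches as a list and then call explode on that column. This is only a sketch: it assumes pandas 0.25 or later (where DataFrame.explode is available), and the `result` name and the `['']` placeholder for rows without a match are my own choices, not something from the question.

```python
import pandas as pd

d = {'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'text': ['bill did this', 'jim did something', 'phil', 'bill did nothing',
              'carl was here', 'this is random', 'other name',
              'other bill', 'bill and carl', 'last one']}
df = pd.DataFrame(data=d)
l = ['bill', 'carl']

# Keep the matches as a list instead of joining them.
# Rows with no match get [''] (an assumption about the desired output),
# so they end up with an empty string rather than NaN after explode.
df['contains'] = df['text'].apply(lambda x: [i for i in l if i in x] or [''])

# explode repeats a row once for each element of its list,
# so the row with id 9 appears twice: once for bill, once for carl.
result = df.explode('contains').reset_index(drop=True)
print(result)
```

Note that `i in x` is a plain substring test, so 'bill' would also match a text such as 'billing'. If that matters for your data, you would need a stricter check, for example splitting the text into words first.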
Upvotes: 5