Reputation: 85
Hi, I have a list and a pandas DataFrame column whose elements are also lists. I want to check whether any element of each row's list is present in the other list, then create one column that is 1 if a match is found and 0 if not, and another column containing the matched elements as a comma-separated string.
I found a similar question, "Check if one or more elements of a list are present in Pandas column", but couldn't work out how to apply it to this case. Thank you very much! :)
import pandas as pd
letters = ['a', 'b', 'c', 'f', 'j']
df_temp = pd.DataFrame({'letters_list': [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i'], ['j', 'h', 'i']]})
How can I create a new column found which is 1 if any letter in the list letters is found in letters_list, and another column letters_found which contains the matched letters as a comma-separated string? It should look like the following.
Upvotes: 3
Views: 2051
Reputation: 260735
You need to use a loop here.
Make letters a set for efficient testing of common elements with set.intersection, and use a list comprehension. Then check whether any letter was found by converting "letters_found" to boolean (an empty string becomes False, anything else True) and then to int to get 0/1.
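# a set gives fast membership tests and supports intersection
letters = {'a', 'b', 'c', 'f', 'j'}
# common letters per row, sorted and joined with commas ('' if none)
df_temp['letters_found'] = [','.join(sorted(letters.intersection(l)))
                            for l in df_temp['letters_list']]
# '' -> False -> 0, any non-empty string -> True -> 1
df_temp['found'] = df_temp['letters_found'].astype(bool).astype(int)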
output:
  letters_list letters_found  found
0    [a, b, c]         a,b,c      1
1    [d, e, f]             f      1
2    [g, h, i]                    0
3    [j, h, i]             j      1
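The approach above returns the matched letters in sorted order. If you would rather keep them in the order they appear in each row, a variant along these lines might work. This is only a sketch: the helper name matched_in_order is made up for illustration, and the found column is built by comparing against the empty string instead of going through bool, which is an alternative I am assuming gives the same 0/1 result here.

```python
import pandas as pd

letters = {'a', 'b', 'c', 'f', 'j'}
df_temp = pd.DataFrame({'letters_list': [['a', 'b', 'c'], ['d', 'e', 'f'],
                                         ['g', 'h', 'i'], ['j', 'h', 'i']]})


def matched_in_order(row_letters, wanted):
    """Hypothetical helper: join the items of row_letters that are in wanted,
    keeping first-appearance order and dropping repeats."""
    # dict.fromkeys keeps insertion order and removes duplicates
    return ','.join(dict.fromkeys(x for x in row_letters if x in wanted))


df_temp['letters_found'] = [matched_in_order(l, letters)
                            for l in df_temp['letters_list']]
# 1 where at least one letter matched, 0 where the joined string is empty
df_temp['found'] = df_temp['letters_found'].ne('').astype(int)

print(df_temp)
```

For the sample data both versions give the same columns, since each row's matches already happen to be in alphabetical order; they would differ for a row such as ['j', 'a'].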
Upvotes: 2