Reputation: 217
I have a data frame which has ids, some code value,code_name and a list_of_ids is a list of ids .
id code code_name list_of_ids
1 AB111 apple [5,2,3,4]
2 AB222 pear [3,4]
3 AB333 banana [1]
4 AB111 apple []
5 AB222 pear [2,6,4]
6 AB333 banana [1,3,5]
7 AB333 banana [1,3,6]
I'm trying to get a count of ids from the list_of_ids where id from the list has the same code_name as a value in id column.
Thanks. The desired output is to get the count. Column for_explanation is added to display only
id code code_name for_explanation count
1 AB111 apple [pear,pear,banana,apple] 1
2 AB222 pear [banana,apple] 0
3 AB333 banana [apple] 0
4 AB111 apple [] 0
5 AB222 pear [pear,banana,apple] 0
6 AB333 banana [apple,banana,pear] 0
7 AB333 banana [apple,banana,banana] 2
Upvotes: 0
Views: 24
Reputation: 260390
You can use a list comprehension:
s = df.set_index('id')['code_name']
df['count'] = [sum(1 for i in l if s.get(i)==c)
for c,l in zip(df['code_name'], df['list_of_ids'])]
Or explode
and groupby
:
s = df.set_index('id')['code_name']
df['count'] = (df
.explode('list_of_ids')
.groupby(level=0)
.apply(lambda d: d['list_of_ids'].map(s).eq(d['code_name']).sum())
)
output:
id code code_name list_of_ids count
0 1 AB111 apple [5, 2, 3, 4] 1
1 2 AB222 pear [3, 4] 0
2 3 AB333 banana [1] 0
3 4 AB111 apple [] 0
4 5 AB222 pear [2, 6, 4] 1
5 6 AB333 banana [1, 3, 5] 1
6 7 AB333 banana [1, 3, 6] 2
Upvotes: 1