Reputation: 315
I want to count how often a word from one list in a dataframe is in another list in another dataframe. My data looks like this:
import pandas as pd

df6 = pd.DataFrame({'variable': 'irreplacable',
                    'Words': [['hi', 'ciao'], ['mine', 'yours']]})
df7 = pd.DataFrame({'text': [['hi', 'is', 'this', 'ciao', 'ciao'], ['hi', 'ciao']]})
Now I want to count how often 'hi' and 'ciao' occur in each cell of df7.text and create a new column in df7 containing this count.
I tried a nested ("double") for loop:
count_word = 0
for index, rows in df7.iterrows():
    for word in df7.text:
        if word in df6.iloc[0, 1]:
            count_word = count_word + 1
df7['counter'] = count_word
With this code the output looks like this:
                         text  counter
0  [hi, is, this, ciao, ciao]        0
1                  [hi, ciao]        0
instead of 3 and 2 in the counter column.
Upvotes: 1
Views: 138
Reputation: 862611
Use a generator expression with sum to count the True values, and in to test membership:
df7['counter']= df7.text.apply(lambda x: sum(i in df6.iloc[0,1] for i in x))
print (df7)
                         text  counter
0  [hi, is, this, ciao, ciao]        3
1                  [hi, ciao]        2
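For comparison, here is a sketch of the same count written as an explicit nested loop. The attempt in the question iterates over df7.text in the inner loop, so each word is actually a whole list and the membership test is never true; the counter is also never reset per row. The names targets and counts are only illustrative.

```python
import pandas as pd

df6 = pd.DataFrame({'variable': 'irreplacable',
                    'Words': [['hi', 'ciao'], ['mine', 'yours']]})
df7 = pd.DataFrame({'text': [['hi', 'is', 'this', 'ciao', 'ciao'], ['hi', 'ciao']]})

targets = df6.iloc[0, 1]      # the first word list: ['hi', 'ciao']

counts = []                   # one count per row of df7
for words in df7['text']:     # each cell holds a list of words
    count_word = 0            # reset the counter for every row
    for word in words:        # iterate over the row's words, not over df7.text
        if word in targets:
            count_word += 1
    counts.append(count_word)

df7['counter'] = counts
print(df7)
```

This should give the same counter column as the apply version; the apply version is shorter and is usually the one to prefer.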
A slightly modified solution that tests every list in df6['Words'] and writes each count to a new column:
for v in df6['Words']:
    df7[', '.join(v)] = df7.text.apply(lambda x: sum(i in v for i in x))
print (df7)
                         text  hi, ciao  mine, yours
0  [hi, is, this, ciao, ciao]         3            0
1                  [hi, ciao]         2            0
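If the word lists get long, a possible variant is to convert each one to a set first, so that every in test is a hash lookup instead of a scan through a list. This is only a sketch: count_matches is a hypothetical helper name, and for lists as short as the ones here the difference will not be noticeable.

```python
import pandas as pd

df6 = pd.DataFrame({'variable': 'irreplacable',
                    'Words': [['hi', 'ciao'], ['mine', 'yours']]})
df7 = pd.DataFrame({'text': [['hi', 'is', 'this', 'ciao', 'ciao'], ['hi', 'ciao']]})


def count_matches(words, target_set):
    """Count the items of `words` that occur in `target_set` (repeats are counted)."""
    return sum(word in target_set for word in words)


for v in df6['Words']:
    target_set = set(v)  # build the set once per word list
    # bind target_set as a default argument so each lambda keeps its own set
    df7[', '.join(v)] = df7['text'].apply(
        lambda x, s=target_set: count_matches(x, s)
    )

print(df7)
```

The resulting columns should match the list-based version, since membership gives the same answer for a set as for a list of strings.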
Upvotes: 1