user11638654

Reputation: 315

Count matches between two lists in different DataFrames (Python)

I want to count how often the words from a list in one DataFrame occur in a list in another DataFrame. My data looks like this:

import pandas as pd

df6 = pd.DataFrame({'variable': 'irreplacable',
                    'Words': [['hi', 'ciao'], ['mine', 'yours']]})
df7 = pd.DataFrame({'text': [['hi', 'is', 'this', 'ciao', 'ciao'], ['hi', 'ciao']]})

So now I want to count how often 'hi' and 'ciao' occur in each cell of df7.text and create a new column in df7 containing this count.

I tried a nested for loop:

count_word = 0
for index,rows in df7.iterrows():
    for word in df7.text:
        if word in df6.iloc[0,1]:
            count_word = count_word +1
    df7['counter']=count_word

With this code the output looks like:

   text                        counter
0  [hi, is, this, ciao, ciao]   0
1  [hi, ciao]                   0

instead of 3 and 2 in the counter column.

Upvotes: 1

Views: 138

Answers (1)

jezrael

Reputation: 862611

Use a generator expression with sum to count the True values, with in to test membership:

df7['counter']= df7.text.apply(lambda x: sum(i in df6.iloc[0,1] for i in x))
print (df7)
                         text  counter
0  [hi, is, this, ciao, ciao]        3
1                  [hi, ciao]        2
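
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the two frames from the question; the only change is that the target words are copied into a set first (the name targets is my own, not from the question), which should make the membership test cheaper on long word lists but is otherwise optional.

```python
import pandas as pd

# Sample data copied from the question
df6 = pd.DataFrame({'variable': 'irreplacable',
                    'Words': [['hi', 'ciao'], ['mine', 'yours']]})
df7 = pd.DataFrame({'text': [['hi', 'is', 'this', 'ciao', 'ciao'], ['hi', 'ciao']]})

# Assumption: only the first row of df6['Words'] is of interest here.
# `targets` is an illustrative name; a set gives constant-time lookups.
targets = set(df6['Words'].iloc[0])

# `w in targets` is True/False for each word; sum() counts the True values
df7['counter'] = df7['text'].apply(lambda words: sum(w in targets for w in words))

print(df7)
```

Each True counts as 1 when summed, so the result per row is the number of words in that cell that belong to the target list, duplicates included.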

A slightly modified solution tests every list in df6['Words'] and writes each count to a new column:

for v in df6['Words']:
    df7[', '.join(v)]= df7.text.apply(lambda x: sum(i in v for i in x))
print (df7)

                         text  hi, ciao  mine, yours
0  [hi, is, this, ciao, ciao]         3            0
1                  [hi, ciao]         2            0
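
As for why the loop in the question returns 0, my reading is this: iterating over df7.text yields each cell as a whole list, so the test asks whether a list is an element of ['hi', 'ciao'], which is always False. The assignment df7['counter'] = count_word also writes one scalar to every row. The sketch below shows the failing check and one possible way to repair the loop; the names targets, first_cell and counts are my own.

```python
import pandas as pd

# Sample data copied from the question
df6 = pd.DataFrame({'variable': 'irreplacable',
                    'Words': [['hi', 'ciao'], ['mine', 'yours']]})
df7 = pd.DataFrame({'text': [['hi', 'is', 'this', 'ciao', 'ciao'], ['hi', 'ciao']]})

targets = df6.iloc[0, 1]  # the list ['hi', 'ciao']

# Iterating over the column yields whole lists, not single words,
# so this membership test compares a list against strings.
first_cell = next(iter(df7['text']))
print(first_cell in targets)  # False

# Repaired loop: reset the counter per row and iterate over the words
counts = []
for words in df7['text']:
    count_word = 0
    for word in words:
        if word in targets:
            count_word += 1
    counts.append(count_word)

# Assign one value per row instead of a single scalar
df7['counter'] = counts
print(df7)
```

This gives the same counts as the apply version; the apply version is just shorter.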

Upvotes: 1
