Lee Roy
Lee Roy

Reputation: 297

Get the count of strings found in column b from column a in any order and return counts in a new column

I'm trying to get the count of substrings in column b which match column a in any order.

example:

[col a]                   [col b]                             [frequency]
big red car            elon musk drives a big red car              1
elon musk car          elon musk drives a big red car              1
red big car            elon musk drives a big red car              1

The maximum amount of matches needs to be fixed at 1. e.g. big red car would only match once, instead of matching for every combination.

I'd need to return exact match on the words if possible. car does not match for card and so on.

what I've tried:

df["frequency"] = df.apply(lambda x: x['col b'].count(x['col a']), axis=1)

this finds exact matches only, but I need them to be matched in any order.

Any help appreciated.

Upvotes: 0

Views: 47

Answers (2)

Anurag Dabas
Anurag Dabas

Reputation: 24304

Try via str.contains():

words='|'.join(df['[col a]'].unique())
#Finally:
df['[frequency]']=df['[col b]'].str.contains(words).astype(int)
#OR
df['[frequency]']=df['[col b]'].str.contains(words).view('i1')

output of df:

[col a]                   [col b]                             [frequency]
big red car            elon musk drives a big red car              1
elon musk car          elon musk drives a big red car              1
red big car            elon musk drives a big red car              1

Upvotes: 1

mozway
mozway

Reputation: 260300

Assuming you want to check that all the words in "[col A]" are in "[col B]":

def ismatch(s):
    A = set(s['[col a]'].split())
    B = set(s['[col b]'].split())
    return A.intersection(B) == A
df.apply(ismatch, axis=1)

input:

         [col a]                         [col b]  [frequency]
0    big red car  elon musk drives a big red car            1
1  elon musk car  elon musk drives a big red car            1
2    red big car  elon musk drives a big red car            1
3   red big card  elon musk drives a big red car            1

output:

0    True
1    True
2    True
3   False

Upvotes: 1

Related Questions