Reputation: 75
I have a DataFrame like this:
import pandas as pd

df = pd.DataFrame({
'col_1':['filmeinlage federspeicher anlegen',
'filmeinlage lm a-kreis',
'weco-pvb-primerspray ral 3012',
'tragrolle unten (metall) talent,t3',
'metallschutzschlauch, spr-va 36',
'gummi pflege liqui moly 500ml',
'gummikugel für 5-stellungskippschalter',
'megaphone er-520 6/10w abs',
'weco primerspray -lar- 3012'],
'col_2':['lm',
'lm',
'pvb',
'metall',
'metall',
'gummi',
'gummi',
'abs',
'lar']
})
I would like to check whether the string in col_2 is present in col_1, but only if it stands on its own or is surrounded by special characters. If so, a new column should contain True, otherwise False, as shown in the example below.
For instance, if col_2 is 'lm' and col_1 only contains it inside a longer word such as 'filmeinlage', the result should be False, but if col_1 is 'filmeinlage lm a-kreis', the result should be True.
Col_1 | Col_2 | Desired_Column |
---|---|---|
filmeinlage federspeicher anlegen | lm | False |
filmeinlage lm a-kreis | lm | True |
weco-pvb-primerspray ral 3012 | pvb | True |
tragrolle unten (metall) talent,t3 | metall | True |
metallschutzschlauch, spr-va 36 | metall | False |
gummi pflege liqui moly 500ml | gummi | True |
gummikugel für 5-stellungskippschalter | gummi | False |
megaphone er-520 6/10w abs | abs | True |
weco primerspray -lar- 3012 | lar | True |
Upvotes: 0
Views: 77
Reputation: 18306
You're looking for "word boundaries", i.e. "\b" in regexes:
import re

# For each row, search col_1 for the col_2 value delimited by word boundaries
df["new"] = [re.search(fr"\b{re.escape(c2)}\b", c1) is not None
             for c1, c2 in zip(df["col_1"], df["col_2"])]
re.escape is there so that any special characters in the col_2 values are matched literally instead of being interpreted as regex syntax.
This gives:
>>> df
col_1 col_2 new
0 filmeinlage federspeicher anlegen lm False
1 filmeinlage lm a-kreis lm True
2 weco-pvb-primerspray ral 3012 pvb True
3 tragrolle unten (metall) talent,t3 metall True
4 metallschutzschlauch, spr-va 36 metall False
5 gummi pflege liqui moly 500ml gummi True
6 gummikugel für 5-stellungskippschalter gummi False
7 megaphone er-520 6/10w abs abs True
8 weco primerspray -lar- 3012 lar True
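One caveat, offered as a sketch rather than as part of the approach above: "\b" only matches between a word character and a non-word character, so it stops working if a col_2 value itself starts or ends with punctuation (for example '-lar-', which is not in the sample data and is used here only for illustration). Assuming that case matters, negative lookarounds are one possible alternative. The helper name contains_token is hypothetical.

```python
import re

def contains_token(text: str, token: str) -> bool:
    """Return True if token occurs in text with no word character directly before or after it."""
    # (?<!\w) and (?!\w) assert that no letter, digit, or underscore is adjacent,
    # which also holds when the token itself begins or ends with punctuation.
    pattern = rf"(?<!\w){re.escape(token)}(?!\w)"
    return re.search(pattern, text) is not None

print(contains_token("filmeinlage federspeicher anlegen", "lm"))  # False
print(contains_token("filmeinlage lm a-kreis", "lm"))             # True
print(contains_token("weco primerspray -lar- 3012", "-lar-"))     # True

# The same hypothetical token with \b: no boundary exists between ' ' and '-'
print(re.search(rf"\b{re.escape('-lar-')}\b",
                "weco primerspray -lar- 3012") is not None)        # False
```

It would plug into the same list comprehension, e.g. [contains_token(c1, c2) for c1, c2 in zip(df["col_1"], df["col_2"])]. For values made only of letters and digits, as in the question, both patterns should give the same result. Note that the underscore counts as a word character in either version.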
Upvotes: 1