Reputation: 17
How can I match a data column against a regex stored in another column of the same row? Here is what the data looks like.
Data Regex
0 HU13568812 ^HU[0-9]{8}
1 78567899 ^NO[0-9]{5}
2 AT1234567 ^AT[0-9]{7}
The output should be a new column holding 1 if the row's data matches its regex and 0 if it does not, like this.
Data Regex Match
0 HU13568812 ^HU[0-9]{8} 1
1 78567899 ^NO[0-9]{5} 0
2 AT1234567 ^AT[0-9]{7} 1
I tried re.match(), but I can't get it to run against every row at once. Is there a simple function or a more Pythonic way to do this? Thank you.
Upvotes: 1
Views: 269
Reputation: 1169
Your question may be similar to "Add additional column in merged csv file". You can put your own logic in a function and apply it row by row, which is very flexible.
import re

def func(row):
    # Put your own logic here and return the value for the new column
    return 1 if re.match(row["Regex"], row["Data"]) else 0

df["Match"] = df.apply(func, axis=1)
print(df)

# Or, more briefly, with a lambda
df["Match"] = df.apply(lambda r: 1 if re.match(r["Regex"], r["Data"]) else 0, axis=1)
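One thing to watch with re.match: it anchors only at the start of the string, so a pattern like ^HU[0-9]{8} also accepts longer values that merely begin with a valid code. If the whole value has to conform, re.fullmatch is an option. Below is a minimal sketch using the sample values from the question plus one extra value with trailing characters; the helper names prefix_match and whole_match are made up for illustration.

```python
import re

def prefix_match(pattern: str, value: str) -> int:
    """Return 1 if the pattern matches at the start of value, else 0."""
    return 1 if re.match(pattern, value) else 0

def whole_match(pattern: str, value: str) -> int:
    """Return 1 only if the pattern matches the entire value, else 0."""
    return 1 if re.fullmatch(pattern, value) else 0

# Sample values from the question, plus one with trailing characters
# (the last row is an added illustration, not from the original data).
rows = [
    ("HU13568812", r"^HU[0-9]{8}"),
    ("78567899", r"^NO[0-9]{5}"),
    ("AT1234567", r"^AT[0-9]{7}"),
    ("HU13568812XYZ", r"^HU[0-9]{8}"),
]

prefix_results = [prefix_match(pat, val) for val, pat in rows]
whole_results = [whole_match(pat, val) for val, pat in rows]
print(prefix_results)  # [1, 0, 1, 1]
print(whole_results)   # [1, 0, 1, 0]
```

With the DataFrame from the question, either helper could be used as df.apply(lambda r: whole_match(r["Regex"], r["Data"]), axis=1).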
Upvotes: 1
Reputation: 22523
One way is to use a list comprehension:
import re
df["Match"] = [1 if re.search(pat, data) else 0
               for data, pat in zip(df["Data"], df["Regex"])]
print(df)
Data Regex Match
0 HU13568812 ^HU[0-9]{8} 1
1 78567899 ^NO[0-9]{5} 0
2 AT1234567 ^AT[0-9]{7} 1
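If the Regex column may contain missing or malformed entries, a slightly more defensive variant of the same loop might help. This is only a sketch and not part of the answer above: compile_cached and match_flags are hypothetical names, and treating invalid patterns or non-string values as 0 is an assumption about the desired behavior. The re module already keeps its own internal cache of compiled patterns, so the main gain here is the error handling rather than speed.

```python
import re
from functools import lru_cache

@lru_cache(maxsize=None)
def compile_cached(pattern):
    """Compile each distinct pattern once; return None if it is invalid."""
    try:
        return re.compile(pattern)
    except re.error:
        return None

def match_flags(values, patterns):
    """Return a list of 1/0 flags, one per (value, pattern) pair."""
    flags = []
    for value, pattern in zip(values, patterns):
        # Assumption: missing or non-string cells count as "no match".
        if not isinstance(value, str) or not isinstance(pattern, str):
            flags.append(0)
            continue
        compiled = compile_cached(pattern)
        flags.append(1 if compiled is not None and compiled.search(value) else 0)
    return flags

# First three rows are from the question; the last two are added
# illustrations (a missing value and an unterminated character set).
data = ["HU13568812", "78567899", "AT1234567", None, "HU1"]
regex = [r"^HU[0-9]{8}", r"^NO[0-9]{5}", r"^AT[0-9]{7}", r"^HU[0-9]{8}", r"^HU[0-9"]
flags = match_flags(data, regex)
print(flags)  # [1, 0, 1, 0, 0]
```

With the DataFrame, this could be called as df["Match"] = match_flags(df["Data"], df["Regex"]).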
Upvotes: 3