Aqilah

Reputation: 17

Match column with its own regex in another column Python

How can I match each value in a data column against the regex stored in another column of the same row? Here is what the data looks like.

    Data             Regex
0   HU13568812       ^HU[0-9]{8}
1   78567899         ^NO[0-9]{5}
2   AT1234567        ^AT[0-9]{7}

The output should be a new column that holds 1 if the value matches its regex and 0 if it does not, like this.

    Data             Regex          Match
0   HU13568812       ^HU[0-9]{8}    1
1   78567899         ^NO[0-9]{5}    0
2   AT1234567        ^AT[0-9]{7}    1

I tried to use re.match(), but I can't get it to run over the whole column at once. Is there a better way to do this, with a simple function or something more Pythonic? Thank you.

Upvotes: 1

Views: 269

Answers (2)

JianMing Wang

Reputation: 1169

Your question looks similar to "Add additional column in merged csv file". You can put your own logic in a function and apply it row by row, which is very flexible:

import re

def func(row):
    # Put your own logic here: return 1 if the row's value matches its regex, else 0.
    return 1 if re.match(row["Regex"], row["Data"]) else 0

df["Match"] = df.apply(func, axis=1)
print(df)

# The same thing as a one-line lambda
df["Match"] = df.apply(lambda r: 1 if re.match(r["Regex"], r["Data"]) else 0, axis=1)
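
For anyone who wants to run the row-wise approach end to end, here is a minimal sketch. It assumes pandas is installed and rebuilds the sample frame from the question, taking row 2's pattern from the expected output. The name row_matches is only illustrative.

```python
import re

import pandas as pd

# Sample data from the question (row 2's pattern taken from the expected output).
df = pd.DataFrame({
    "Data": ["HU13568812", "78567899", "AT1234567"],
    "Regex": ["^HU[0-9]{8}", "^NO[0-9]{5}", "^AT[0-9]{7}"],
})

def row_matches(row):
    # re.match returns a match object or None; bool() makes that True/False,
    # and int() turns it into the 1/0 flag the question asks for.
    return int(bool(re.match(row["Regex"], row["Data"])))

# axis=1 passes each row to the function, so every value meets its own pattern.
df["Match"] = df.apply(row_matches, axis=1)
print(df["Match"].tolist())  # [1, 0, 1]
```

Note that this assumes every value in Data is a string; a NaN or numeric cell would make re.match raise a TypeError.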

Upvotes: 1

Henry Yik

Reputation: 22523

One way is to use a list comprehension:

import re

df["Match"] = [1 if re.search(pat, data) else 0
               for data, pat in zip(df["Data"], df["Regex"])]

print(df)

         Data        Regex  Match
0  HU13568812  ^HU[0-9]{8}      1
1    78567899  ^NO[0-9]{5}      0
2   AT1234567  ^AT[0-9]{7}      1
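
One caveat worth checking against your data: the patterns in the question start with ^ but do not end with $, so re.match and re.search both accept a value that has extra characters after the matched prefix. If the whole value has to match, re.fullmatch is one option. The sketch below uses plain lists so it runs without pandas; match_flags and its whole parameter are hypothetical names, and the fourth sample value is made up to show the difference.

```python
import re

def match_flags(values, patterns, whole=False):
    """Return a 1/0 flag for each (value, pattern) pair.

    whole=False mirrors re.search; whole=True requires the entire
    value to match the pattern (re.fullmatch).
    """
    flags = []
    for value, pat in zip(values, patterns):
        compiled = re.compile(pat)
        hit = compiled.fullmatch(value) if whole else compiled.search(value)
        flags.append(1 if hit else 0)
    return flags

# First three pairs come from the question; the fourth is an invented
# value with one digit too many, to show where the two modes differ.
data = ["HU13568812", "78567899", "AT1234567", "HU135688129"]
pats = ["^HU[0-9]{8}", "^NO[0-9]{5}", "^AT[0-9]{7}", "^HU[0-9]{8}"]

print(match_flags(data, pats))              # [1, 0, 1, 1]
print(match_flags(data, pats, whole=True))  # [1, 0, 1, 0]
```

With a DataFrame, the same helper could be used as df["Match"] = match_flags(df["Data"], df["Regex"]).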

Upvotes: 3
