Substring replacement based on a regex using pandas

Question

I am trying to modify data frame values and mask IP addresses using regex.

I have a list of IP addresses that I am trying to mask in the data frame. For example:

123.123.123.123 should become 12X.XXX.XXX.X23

23.123.123.123 should become 23.XXX.XXX.X23

So I always keep the first 2 and the last 2 characters of the IP, and I want to replace every digit in between with X.

David · Accepted Answer

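This should help. It replaces every digit with X, drops the first two and last two characters of the masked string, and puts the original ones back in their place:

import re
df['ip_masked'] = df.ip.str[:2] + df.ip.apply(lambda x: re.sub(r'\d', 'X', x)[2:-2]) + df.ip.str[-2:]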
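To make the logic of the one-liner easier to check, here is a minimal plain-Python sketch of the same masking rule. The helper name mask_ip and the guard for very short strings are my own additions for illustration and are not part of the answer above.

```python
import re

def mask_ip(ip: str) -> str:
    """Keep the first and last two characters; replace digits in between with 'X'."""
    if len(ip) <= 4:
        # Nothing lies between the kept characters, so return the input unchanged.
        return ip
    # Only the middle slice is passed to re.sub, so the kept ends stay as they are.
    return ip[:2] + re.sub(r"\d", "X", ip[2:-2]) + ip[-2:]

# The two examples from the question.
for ip in ["123.123.123.123", "23.123.123.123"]:
    print(ip, "->", mask_ip(ip))
```

Assuming the column is named ip as in the answer, this could be applied with df['ip_masked'] = df['ip'].map(mask_ip). The dots are left alone because the pattern only matches digits.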
