Reputation: 1096
Let’s say I want to find all credit card numbers in a 'text' and replace the first three 4-digit groups with XXXX, leaving the last group as it is.
How can I do this with re.sub()?
My best try so far is
re.sub(r"(\d{4}-){3}", "XXXX-XXXX-XXXX-", text)
But of course this pattern would also replace text that is not a credit card number, like '1234-5678-1234-asdfg'.
Upvotes: 3
Views: 3732
Reputation: 65791
You could use a lookahead assertion:
re.sub(r"(\d{4}-){3}(?=\d{4})", "XXXX-XXXX-XXXX-", text)
E.g.:
In [1]: import re
In [2]: text = '1234-5678-9101-1213 1415-1617-1819-hello'
In [3]: re.sub(r"(\d{4}-){3}(?=\d{4})", "XXXX-XXXX-XXXX-", text)
Out[3]: 'XXXX-XXXX-XXXX-1213 1415-1617-1819-hello'
Note, though, that this would also match inside a longer token such as asdf1234-4567-1234-4567-asdf; a leading \b word boundary would rule that case out.
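To avoid matching inside longer tokens, one option is to add word boundaries around the number. This is only a sketch: it assumes a card number is exactly four hyphen-separated 4-digit groups that are not glued to other word characters. The names CARD_PREFIX and mask_cards are made up for illustration.

```python
import re

# Assumption: a card number is four 4-digit groups joined by hyphens,
# with no letters, digits or underscores directly before or after it.
# \b at the start rejects "asdf1234-..."; \b inside the lookahead
# rejects a fourth group that runs on into more word characters.
CARD_PREFIX = re.compile(r"\b(?:\d{4}-){3}(?=\d{4}\b)")

def mask_cards(text):
    """Replace the first three groups of each card-like number with XXXX."""
    return CARD_PREFIX.sub("XXXX-XXXX-XXXX-", text)

sample = "1234-5678-9101-1213 asdf1234-4567-1234-4567-asdf 1415-1617-1819-hello"
print(mask_cards(sample))
# XXXX-XXXX-XXXX-1213 asdf1234-4567-1234-4567-asdf 1415-1617-1819-hello
```

Keep in mind that \b treats the underscore as a word character, so a number directly next to an underscore would not be masked either.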
Upvotes: 6
Reputation: 69937
Another way using a backreference:
import re
data = "4220-1234-9948-2245 is a cc num i have and so is 4153-4222-3942-4852 but dont tell anyone"
# \2 refers to the second group, i.e. the last four digits, which are kept as-is
print(re.sub(r"(\d{4}-){3}(\d{4})", "XXXX-XXXX-XXXX-\\2", data))
# XXXX-XXXX-XXXX-2245 is a cc num i have and so is XXXX-XXXX-XXXX-4852 but dont tell anyone
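If the numbered backreference feels fragile, the same idea can be written with a named group, so the replacement does not depend on counting parentheses. This is a sketch of that variant; the group name last and the shortened sample string are chosen here for illustration.

```python
import re

data = "4220-1234-9948-2245 is a cc num"

# (?:...) makes the repeated prefix non-capturing, so the only group
# is the named one holding the last four digits.
pattern = re.compile(r"(?:\d{4}-){3}(?P<last>\d{4})")

# \g<last> inserts whatever the named group matched.
masked = pattern.sub(r"XXXX-XXXX-XXXX-\g<last>", data)
print(masked)
# XXXX-XXXX-XXXX-2245 is a cc num
```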
Upvotes: 3