Reputation: 685
I have a sentence like the one below:
test_str = r'Mr.X has 23 apples and 59 oranges, his business partner from Colorado staying staying in hotel with phone number +188991234 and his wife and kids are staying away from him'
I would like to replace every digit in the above sentence with '0', except in the phone number, which should keep only its +1 prefix and have the remaining digits masked with '*'.
result = r'Mr.X has 00 apples and 00 oranges, his business partner from Colorado staying staying in hotel with phone number +1******** and his wife and kids are staying away from him'
I have the following regex to replace the phone number pattern (which always has a consistent number of digits).
result = re.sub(r'(.*)?(\+1)(\d{8})', r'\1\2********', test_str)
Could I replace the other digits with 0, leaving the phone number masked as above, in one single regex?
Upvotes: 0
Views: 206
Reputation: 163277
If you want to keep the first 3 digits of the phone number and allow an optional +1 prefix, you can use a single pattern:
(?<!\S)((?:\+1)?)(\d{3})(\d{5})(?!\S)|\d+
In parts
(?<! Negative lookbehind, assert that what is directly to the left is not
\S A non-whitespace char
) Close lookbehind
( Capture group 1
(?: Non-capture group
\+ Match a literal + char
1 Match a literal 1 char
)? Close the non-capture group and make it optional
) Close group 1
( Capture group 2
\d{3} Match 3 digits
) Close group 2
( Capture group 3
\d{5} Match 5 digits
) Close group 3
(?! Negative lookahead, assert that what is directly to the right is not
\S A non-whitespace char
) Close lookahead
| Or
\d+ Match 1 or more digits
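The breakdown above can also be written straight into the pattern using re.VERBOSE, so each part carries its own comment. This is only a sketch of the same pattern; the name PATTERN is an assumption made for illustration.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE.
# In verbose mode, whitespace and "#" comments inside the pattern are ignored.
PATTERN = re.compile(
    r"""
    (?<!\S)        # not preceded by a non-whitespace char
    ((?:\+1)?)     # group 1: optional +1 prefix
    (\d{3})        # group 2: first 3 digits (kept)
    (\d{5})        # group 3: last 5 digits (masked)
    (?!\S)         # not followed by a non-whitespace char
    |
    \d+            # any other run of digits
    """,
    re.VERBOSE,
)

m = PATTERN.search("call +188991234 now")
print(m.groups())
```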
Example code
import re
pattern = r"(?<!\S)((?:\+1)?)(\d{3})(\d{5})(?!\S)|\d+"
s = ("Mr.X has 23 apples and 59 oranges, his business partner from Colorado staying staying in hotel with phone number +188991234 and his wife and kids are staying away from him\n\n"
     "This is a tel 12345678 and this is 1234567 123456789")
result = re.sub(
    pattern,
    lambda x: x.group(1) + x.group(2) + "*" * len(x.group(3)) if x.group(2) else "0" * len(x.group()),
    s)
print(result)
Output
Mr.X has 00 apples and 00 oranges, his business partner from Colorado staying staying in hotel with phone number +1889***** and his wife and kids are staying away from him
This is a tel 123***** and this is 0000000 000000000
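For the result originally asked for in the question (keep only +1 and mask all 8 remaining digits), the same single-pattern idea could be adapted roughly as follows. This is a sketch, assuming the phone number is always +1 followed by exactly 8 digits and is delimited by whitespace or the ends of the string; the names PHONE_OR_DIGITS and mask_digits are made up for illustration.

```python
import re

# Assumption: a phone number is "+1" followed by exactly 8 digits,
# with whitespace (or the string boundary) on both sides.
PHONE_OR_DIGITS = re.compile(r"(?<!\S)(\+1)\d{8}(?!\S)|\d+")

def mask_digits(text):
    def repl(m):
        if m.group(1):
            # Phone alternative matched: keep "+1", mask everything after it.
            return m.group(1) + "*" * (len(m.group()) - len(m.group(1)))
        # Any other run of digits: replace each digit with 0.
        return "0" * len(m.group())
    return PHONE_OR_DIGITS.sub(repl, text)

print(mask_digits("Mr.X has 23 apples, phone number +188991234 and 59 oranges"))
# Mr.X has 00 apples, phone number +1******** and 00 oranges
```

Note that if the phone number is directly followed by punctuation such as a comma, the lookahead fails and its digits fall through to the \d+ branch, so the boundaries may need adjusting for real text.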
Upvotes: 0
Reputation: 3555
We could use re.sub with a function as the replacement.
To replace the phone number, use the regex below. All digits that follow +1 are replaced by an equal number of *:
result = re.sub(r'(?<!\w)(\+1)(\d+)', lambda x:x.group(1) + '*'*len(x.group(2)), test_str)
To replace the other numbers with 0, use the regex below. Every run of digits that is not preceded by + or another digit is replaced by an equal number of 0s:
result = re.sub(r'(?<![\+\d])(\d+)', lambda x:'0'*len(x.group(1)), test_str)
Example:
>>> test_str = r'Mr.X has 23 apples and 59 oranges, his phone number +188991234'
>>> result = re.sub(r'(?<!\w)(\+1)(\d+)', lambda x:x.group(1) + '*'*len(x.group(2)), test_str)
>>> result = re.sub(r'(?<![\+\d])(\d+)', lambda x:'0'*len(x.group(1)), result)
>>> result
'Mr.X has 00 apples and 00 oranges, his phone number +1********'
Add-on for the follow-up question in the comments: to retain the first 3 digits of the phone number, we only need to modify the first regex (the +1 part), while the second regex remains the same.
>>> test_str = r'Mr.X has 23 apples and 59 oranges, his phone number +188991234'
>>> result = re.sub(r'(?<!\w)(\+\d{3})(\d+)', lambda x:x.group(1) + '*'*len(x.group(2)), test_str)
>>> result = re.sub(r'(?<![\+\d])(\d+)', lambda x:'0'*len(x.group(1)), result)
>>> result
'Mr.X has 00 apples and 00 oranges, his phone number +188******'
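The two passes could also be wrapped in one helper so the number of visible digits is a parameter. This is a minimal sketch, assuming the prefix is always +1; the function name mask_two_pass and the keep parameter are hypothetical and not part of the answer above.

```python
import re

def mask_two_pass(text, keep=0):
    # keep: hypothetical parameter, how many digits stay visible after "+1".
    phone = re.compile(r"(?<!\w)(\+1\d{%d})(\d+)" % keep)
    # Pass 1: mask the phone number, keeping the prefix captured in group 1.
    text = phone.sub(lambda m: m.group(1) + "*" * len(m.group(2)), text)
    # Pass 2: zero out every digit run not preceded by "+" or another digit,
    # which leaves the visible part of the phone number untouched.
    return re.sub(r"(?<![+\d])\d+", lambda m: "0" * len(m.group()), text)

test_str = r'Mr.X has 23 apples and 59 oranges, his phone number +188991234'
print(mask_two_pass(test_str))          # ... his phone number +1********
print(mask_two_pass(test_str, keep=2))  # ... his phone number +188******
```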
Upvotes: 1