Reputation: 8187
Following the question here, I'm trying to replace a hyphen unless it is part of a US postal code.
The logic is:
I've tried to achieve this using:
import re
p = re.compile(r'(?!\d+\-\d+)-') # regex here
test_str = "12345-4567 hello-you"
re.sub(p, " ", test_str)
Expected output:
12345-4567 hello you
Actual output:
12345 4567 hello you
What am I doing wrong?
Upvotes: 2
Views: 225
Reputation: 626903
You may use
import re
p = re.compile(r'(?!(?<=\d)-\d)-')
test_str = "12345-4567 hello-you 45-year N-45"
print(re.sub(p, " ", test_str))
# => 12345-4567 hello you 45 year N 45
See the Python demo and the regex demo.
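To make the difference concrete, here is a small sketch that runs the pattern from the question next to the one above. The variable names and the third, split-up pattern are assumptions of this sketch (the latter is just the same condition rewritten as "not preceded by a digit, or not followed by a digit"); neither comes from the original post.

```python
import re

test_str = "12345-4567 hello-you 45-year N-45"

# Pattern from the question: the lookahead is checked AT the hyphen, so \d+
# would have to match the "-" itself. It never can, so the negative lookahead
# always succeeds and every hyphen gets replaced.
question_pattern = re.compile(r'(?!\d+\-\d+)-')

# Pattern from this answer: the lookbehind inside the lookahead inspects the
# character before the hyphen, so only digit-hyphen-digit is left alone.
answer_pattern = re.compile(r'(?!(?<=\d)-\d)-')

# Illustrative rewrite of the same condition (assumption of this sketch):
# a hyphen not preceded by a digit, or a hyphen not followed by a digit.
split_pattern = re.compile(r'(?<!\d)-|-(?!\d)')

print(question_pattern.sub(" ", test_str))  # 12345 4567 hello you 45 year N 45
print(answer_pattern.sub(" ", test_str))    # 12345-4567 hello you 45 year N 45
print(split_pattern.sub(" ", test_str))     # 12345-4567 hello you 45 year N 45
```

Note that these patterns keep any hyphen between two digits, not only one inside a 5+4 postal code.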
The (?!(?<=\d)-\d)- regex matches:
(?!(?<=\d)-\d) - a location in the string that is not immediately followed with a - (that is itself immediately preceded with a digit) followed with a digit
- - a hyphen.
Another approach is to match and capture postal-code-like strings to keep them, and to replace - in all other contexts:
re.sub(r'\b(\d{5}-\d{4})\b|-', r'\1 ', text)
See the regex demo and the Python demo.
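Note \b(\d{5}-\d{4})\b matches a word boundary position first, then matches and captures into Group 1 any five digits, a hyphen and four digits, and finally matches a word boundary again. The \1 backreference in the replacement pattern refers to the value captured in Group 1. When the plain - alternative matches instead, Group 1 did not participate in the match and \1 expands to an empty string (Python 3.5 and later), so only the space is inserted.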
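As a runnable sketch of this second approach: with the string replacement r'\1 ', a kept postal code also gets a space appended after it, which doubles the space that already follows it in the input. One way around that, assuming that extra space is unwanted, is to pass a function as the replacement. The helper name keep_zip_or_space and the sample text are hypothetical and only serve the illustration.

```python
import re

# A postal-code-like token (5 digits, hyphen, 4 digits) is captured in Group 1;
# any other hyphen falls through to the second alternative.
pattern = re.compile(r'\b(\d{5}-\d{4})\b|-')

def keep_zip_or_space(match):
    # Hypothetical helper: return the captured postal code unchanged,
    # or a single space when only the bare hyphen matched (Group 1 is None).
    return match.group(1) or " "

text = "12345-4567 hello-you 45-year N-45"

# String template: the space is appended after the kept postal code too.
print(repr(pattern.sub(r'\1 ', text)))
# '12345-4567  hello you 45 year N 45'

# Callable replacement: postal codes are returned exactly as matched.
print(repr(pattern.sub(keep_zip_or_space, text)))
# '12345-4567 hello you 45 year N 45'
```

Unlike the lookaround version, this one protects only the exact 5+4 digit shape, so a hyphen in something like 45-67 would still be replaced.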
Upvotes: 3