Reputation: 8187
I need to remove all commas except those between numbers (ignoring spaces). Some examples are:
foo, bar 12,6 => foo bar 12,6
foo, bar 12, 6 => foo bar 12, 6
foo, bar 12 ,6 => foo bar 12 ,6
foo, bar 12, => foo bar 12
foo, bar ,6 => foo bar 6
foo,5 => foo5
foo ,5 => foo 5
logic:
I've tried a negative lookahead and a negative lookbehind:
# regex:
# (?<!\d)(\s?)(,)(\s?)(?!\d)
import re
def replace_comma_by_space(value):
    return re.sub(r'(?<!\d)(\s?)(,)(\s?)(?!\d)', r'\1\3', value)
but this fails on cases like:
foo,5 => foo,5
foo ,5 => foo ,5
I feel like I'm close. Thanks in advance for any help.
Upvotes: 2
Views: 409
Reputation: 626794
You can use
re.sub(r'(\d\s*,)(?=\s*\d)|,', r'\1', text)
See the regex demo. Details:
- (\d\s*,)(?=\s*\d) - a digit, zero or more whitespace characters, and a comma, captured into Group 1 (\1 in the replacement pattern refers to this value), followed by zero or more whitespace characters and a digit
- | - or
- , - a comma in any other context.
See the Python demo:
import re
strings = ['foo, bar 12,6','foo, bar 12, 6','foo, bar 12 ,6','foo, bar 12,','foo, bar ,6','foo,5','foo ,5']
for s in strings:
    print(s, '=>', re.sub(r'(\d\s*,)(?=\s*\d)|,', r'\1', s))
Output:
foo, bar 12,6 => foo bar 12,6
foo, bar 12, 6 => foo bar 12, 6
foo, bar 12 ,6 => foo bar 12 ,6
foo, bar 12, => foo bar 12
foo, bar ,6 => foo bar 6
foo,5 => foo5
foo ,5 => foo 5
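To make the behaviour easy to re-check, the substitution can be wrapped in a small function and asserted against the examples from the question. This is only a minimal sketch: the names COMMA_RE, strip_commas and EXPECTED are illustrative and do not come from the original answer.

```python
import re

# Alternative 1 matches "digit, optional whitespace, comma" when a digit follows
# (after optional whitespace) and stores it in Group 1.
# Alternative 2 matches any other comma and leaves Group 1 unset.
COMMA_RE = re.compile(r'(\d\s*,)(?=\s*\d)|,')


def strip_commas(text):
    """Remove every comma except those located between two digits."""
    # An unset Group 1 expands to an empty string in the replacement (Python 3.5+),
    # so commas matched by the second alternative simply disappear.
    return COMMA_RE.sub(r'\1', text)


# Input -> expected output, taken from the examples in the question.
EXPECTED = {
    'foo, bar 12,6': 'foo bar 12,6',
    'foo, bar 12, 6': 'foo bar 12, 6',
    'foo, bar 12 ,6': 'foo bar 12 ,6',
    'foo, bar 12,': 'foo bar 12',
    'foo, bar ,6': 'foo bar 6',
    'foo,5': 'foo5',
    'foo ,5': 'foo 5',
}

for source, wanted in EXPECTED.items():
    assert strip_commas(source) == wanted, (source, strip_commas(source))
```

Compiling the pattern once is optional here; it mainly keeps the pattern and its explanation in one place.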
With the PyPI regex module, you can use
\d+(?:\s*,\s*\d+)+(*SKIP)(*F)|,
See the regex demo. Here, \d+(?:\s*,\s*\d+)+(*SKIP)(*F) matches one or more digits followed by one or more occurrences of a comma (optionally surrounded by whitespace) and one or more digits. Thanks to the (*SKIP)(*F) verbs, this whole character sequence is then discarded and the next search starts at the position where the failure occurred. Any comma outside such a sequence is matched by the second alternative.
See this Python demo:
import regex
# Number lists such as "12, 6" are skipped; every other comma is matched and removed
rx = regex.compile(r'\d+(?:\s*,\s*\d+)+(*SKIP)(*F)|,')
strings = ['foo, bar 12,6','foo, bar 12, 6','foo, bar 12 ,6','foo, bar 12,','foo, bar ,6','foo,5','foo ,5']
for s in strings:
    print(s, '=>', rx.sub('', s))
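If installing the third-party regex module is not an option, a similar effect can be approximated with the standard re module. Note that this is a different technique from (*SKIP)(*F): the number list is captured in a group and a replacement callback puts it back unchanged, while a bare comma is replaced with an empty string. The names NUMBER_LIST_OR_COMMA and drop_commas are hypothetical, chosen only for this sketch.

```python
import re

# Group 1: digits, then one or more ", digits" parts (whitespace allowed around commas).
# Otherwise: a single comma, with Group 1 left unset.
NUMBER_LIST_OR_COMMA = re.compile(r'(\d+(?:\s*,\s*\d+)+)|,')


def drop_commas(text):
    """Delete commas unless they sit inside a comma-separated run of numbers."""
    # m.group(1) is None for a bare comma, so "or ''" removes it;
    # a matched number list is returned as-is.
    return NUMBER_LIST_OR_COMMA.sub(lambda m: m.group(1) or '', text)


strings = ['foo, bar 12,6', 'foo, bar 12, 6', 'foo, bar 12 ,6',
           'foo, bar 12,', 'foo, bar ,6', 'foo,5', 'foo ,5']
for s in strings:
    print(s, '=>', drop_commas(s))
```

The callback is needed because the alternation consumes the whole number list; a plain replacement string of '' would delete it.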
Upvotes: 5