Reputation: 443
I have strings of the following kind: "§ 9,12,14,15 und 16" or "§ 9,12 und 16".
I want to change them to "§ 9, § 12, § 14, § 15 und §16" or "§ 9, § 12 und § 16", respectively.
The count of numbers varies, and I want a code snippet that works for any length:
import re

text = "§§ 9,12,14,15 und 16"
text = re.sub(r'§* (\d+),(\d+),(\d+),(\d+) und (\d+)', r'§ \1, § \2, § \3, § \4 und § \5', text)
I only manage to match the string if I know the number of digits.
Upvotes: 1
Views: 197
Reputation: 110675
You can do that with re.sub, using a regular expression and a lambda for the replacement.
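import re

text = "aaa § 9,12,14,15 und 16 bbb"  # renamed from `str`, which shadows the built-in
rgx = r'(?:,|(?<!§) )(?=\d)'
# A comma becomes ', § '; a space not preceded by '§' becomes ' §'.
re.sub(rgx, lambda m: ', § ' if m.group() == ',' else ' §', text)
#=> "aaa § 9, § 12, § 14, § 15 und §16 bbb"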
The regular expression can be broken down as follows.
(?: # begin a non-capture group
, # match a comma
| # or
(?<!§) # next character cannot be preceded by '§'
[ ] # match a space
) # end non-capture group
(?=\d) # next character must be a digit
(?<!§) is a negative lookbehind; (?=\d) is a positive lookahead. I've placed the space in a character class ([ ]) merely to make it visible.
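As a quick check of the pattern above, here is a minimal sketch that wraps it in a helper and runs it on both sample strings from the question. The name add_section_signs is my own and not from the question or any library.

```python
import re

# Pattern from the answer above: a comma, or a space not preceded by '§',
# in either case followed by a digit.
PATTERN = re.compile(r'(?:,|(?<!§) )(?=\d)')

def add_section_signs(text: str) -> str:
    """Hypothetical helper: insert '§' before every number after the first."""
    # A comma becomes ', § '; a bare space before a digit becomes ' §'.
    return PATTERN.sub(lambda m: ', § ' if m.group() == ',' else ' §', text)

print(add_section_signs("§ 9,12,14,15 und 16"))  # § 9, § 12, § 14, § 15 und §16
print(add_section_signs("§ 9,12 und 16"))        # § 9, § 12 und §16
```

This reproduces the "und §16" spacing from the first example in the question; returning ' § ' instead of ' §' in the lambda should give "und § 16". Keep in mind that the pattern fires on any space directly before a digit, so unrelated numbers elsewhere in the text would also get a '§'.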
Upvotes: 1
Reputation: 189357
There is no single regex which can do that. What you can do is split your string into parts, and perform a substitution on each.
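import re

text = "§§ 9,12,14,15 und 16"
parts = re.search(r'(§*)\s*((?:\d+,?\s*)+)\s*und\s+(\d+)', text)
if parts:
    sections = parts.group(2)  # "9,12,14,15 " (the numbers before "und")
    # Prefix every number with '§', then re-attach the final number after "und".
    text = re.sub(r'(\d+)', r'§\1', sections) + ' und §' + parts.group(3)
    #=> "§9,§12,§14,§15  und §16"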
The spacing in your example ends up being a bit irregular but this can be fixed up with some light post-processing.
text = re.sub(r',(?!\s)', ', ', re.sub(r'\s+', ' ', text))
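If you would rather keep any surrounding text and get "§ " with a space before every number, one option is to match the whole list and rebuild it in a replacement function. This is only a sketch: the names expand_sections and _rebuild are made up for illustration, and it assumes every list has the form "§ n,n,... und n".

```python
import re

# One or more '§', a comma-separated list of numbers, then "und <number>".
LIST_RE = re.compile(r'§+\s*(\d+(?:\s*,\s*\d+)*)\s+und\s+(\d+)')

def _rebuild(match: re.Match) -> str:
    # Split the comma-separated part into its numbers and prefix each with '§ '.
    numbers = re.findall(r'\d+', match.group(1))
    head = ', '.join(f'§ {n}' for n in numbers)
    # Re-attach the final number that follows "und".
    return f'{head} und § {match.group(2)}'

def expand_sections(text: str) -> str:
    """Hypothetical helper: expand every '§ a,b und c' list found in text."""
    return LIST_RE.sub(_rebuild, text)

print(expand_sections("aaa §§ 9,12,14,15 und 16 bbb"))
# aaa § 9, § 12, § 14, § 15 und § 16 bbb
print(expand_sections("§ 9,12 und 16"))
# § 9, § 12 und § 16
```

Because the replacement is built from the extracted numbers rather than from the original whitespace, no spacing clean-up pass is needed afterwards.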
Upvotes: 1