Reputation: 77
I'm trying to search and replace parts of strings using re.sub and Python's string formatting. I want all text matching 'ESO \d+-\d+' to be rewritten in the form 'ESO \d{3}-\d{3}', padded with leading zeroes.
I thought that this would work:
re.sub(r"ESO (\d+)-(\d+)" ,"ESO {:0>3}-{:0>3}".format(r"\1",r"\2"), line)
But I get strange results:
'ESO 409-22' becomes 'ESO 0409-022'
'ESO 539-4' becomes 'ESO 0539-04'
I can't see the error. In fact, if I split it into two operations, I get the correct result:
>>> ricerca = re.search(r"ESO (\d+)-(\d+)","ESO 409-22")
>>> print("ESO {:0>3}-{:0>3}".format(ricerca.group(1),ricerca.group(2)))
ESO 409-022
Upvotes: 0
Views: 109
Reputation: 36023
"ESO {:0>3}-{:0>3}".format(r"\1",r"\2")
evaluates to the same as:
r"ESO 0\1-0\2"
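because format is called before re.sub ever runs, and each of r"\1" and r"\2" is a two-character string that gets padded to width 3 with a single 0. The group substitution then proceeds normally, so it just puts a 0 in front of each number.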
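To make the order of evaluation visible, here is a small sketch that builds the replacement string separately before handing it to re.sub. The variable names template and result are only illustrative:

```python
import re

# str.format runs first. r"\1" is a two-character string (backslash + "1"),
# so padding it to width 3 adds exactly one "0" in front.
template = "ESO {:0>3}-{:0>3}".format(r"\1", r"\2")
print(template)  # ESO 0\1-0\2

# re.sub only ever sees the already-padded template, so the captured
# digits are inserted after the literal "0" characters.
result = re.sub(r"ESO (\d+)-(\d+)", template, "ESO 409-22")
print(result)  # ESO 0409-022
```

The padding is applied to the backreference text, not to the digits the groups will later capture.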
Your last code sample is a very sensible way to solve this problem, so stick with it. If you really need to use re.sub, pass a function as the replacement:
>>> import re
>>> line = 'ESO 409-22'
>>> re.sub(r"ESO (\d+)-(\d+)", lambda match: "ESO {:0>3}-{:0>3}".format(*match.groups()), line)
'ESO 409-022'
>>> help(re.sub)
Help on function sub in module re:

sub(pattern, repl, string, count=0, flags=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl. repl can be either a string or a callable;
    if a string, backslash escapes in it are processed. If it is
    a callable, it's passed the match object and must return
    a replacement string to be used.
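If the lambda gets hard to read, the same callable-replacement idea can be written with a named function. This is just a sketch: ESO_PATTERN, pad_eso and normalize are made-up names, and str.zfill is swapped in for the {:0>3} format spec as an equivalent way to left-pad digit strings with zeroes:

```python
import re

# Precompiled pattern; the two groups capture the numbers around the dash.
ESO_PATTERN = re.compile(r"ESO (\d+)-(\d+)")

def pad_eso(match):
    """Return the matched designation with both numbers zero-padded to 3 digits."""
    first, second = match.groups()
    # zfill pads on the left with "0" and leaves longer strings unchanged.
    return "ESO {}-{}".format(first.zfill(3), second.zfill(3))

def normalize(line):
    """Replace every ESO designation in line with its padded form."""
    return ESO_PATTERN.sub(pad_eso, line)

print(normalize("ESO 409-22 and ESO 539-4"))  # ESO 409-022 and ESO 539-004
```

Because the function receives the match object, the padding is applied to the actual captured digits, once per match.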
Upvotes: 1