Reputation: 25
I have only recently started learning and using regular expressions.
I have a tuple of file names returned from os.walk(), like so:
files = ('s8_00.tif', 's9_00.tif', 's10_000.tif', 's11_00.tif')
I am trying to get it to look like this:
files = ('s8_##.tif', 's9_##.tif', 's10_###.tif', 's11_##.tif')
I have tried to use this.
pad2 = re.compile(r'_00?')
for root, dirs, files in seqDirs:
    pad = files[0]
    p = pad2.sub("#", pad)
    print p
This returns:
p = ('s8#.tif', 's9#.tif', 's10#0.tif', 's11#.tif')
So I changed the expression around to:
pad2 = re.compile('(_)0+')
giving me:
p = ('s8#.tif', 's9#.tif', 's10#.tif', 's11#.tif')
Is the problem in my call to pad2.sub? Does the problem lie in my compiled expression? Or is it the "_" in the expression that is breaking it?
I even tried passing an expression as the replacement in pad2.sub just to test it out, and that didn't work either. I know I am missing something small here, and I am a bit stuck.
Any and all help will be greatly appreciated along with explanations of logic.
Upvotes: 1
Views: 137
Reputation: 2671
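If you want it to work for any run of digits, make your regex
pattern = re.compile(r"_(\d+)")
and do the substitution with a function, so the number of # characters can depend on the match:
pattern.sub(lambda m: "_" + "#" * len(m.group(1)), filename)
A replacement string can refer to what the parentheses captured with "\g<1>" for the first group, "\g<2>" for the next set of parentheses, and so on. That reference is only expanded inside sub(), though, so len("\g<1>") would measure the literal five-character string rather than the match, which is why a function is needed here. "\d+" matches one or more digit characters. If you specifically want to match only zeros, you could replace it with "_(0+)".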
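A minimal sketch of this idea applied to the file names from the question. It uses a replacement function so the length of each match is known at substitution time; the helper name pad_digits is made up for illustration.

```python
import re

# The pattern captures the digits that follow an underscore.
pattern = re.compile(r"_(\d+)")

def pad_digits(filename):
    # Hypothetical helper: keep the "_" and emit one "#" per captured digit.
    return pattern.sub(lambda m: "_" + "#" * len(m.group(1)), filename)

files = ('s8_00.tif', 's9_00.tif', 's10_000.tif', 's11_00.tif')
padded = tuple(pad_digits(name) for name in files)
print(padded)
# ('s8_##.tif', 's9_##.tif', 's10_###.tif', 's11_##.tif')
```

The digit in "s8" is left alone because it is not preceded by an underscore.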
Upvotes: 2
Reputation: 8818
You're better off finding the matches, calculating their length, and then replacing them with that number of # characters.
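One way this could look, as a rough sketch rather than the only way to do it: find the run of zeros with re.search, measure it, and rebuild the string by slicing. The helper name hash_zeros is an assumption for illustration, and it only handles the first match in each name.

```python
import re

def hash_zeros(name):
    # Hypothetical helper: locate the first run of zeros after "_".
    match = re.search(r"_(0+)", name)
    if match is None:
        return name  # nothing to replace
    start, end = match.span(1)  # span of the zeros only, excluding "_"
    # Rebuild the name with one "#" per zero that was found.
    return name[:start] + "#" * (end - start) + name[end:]

print(hash_zeros('s10_000.tif'))  # s10_###.tif
```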
Upvotes: 0
Reputation: 13232
We're going to use a function for the replacement, not a string.
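import re

def replacer(data):
    # Replace each zero in the run directly after "_" with "#".
    return re.sub(r'(?<=_)(0+)', lambda m: m.group(0).replace('0', '#'), data)

files = ('s8_000.tif', 's9_00.tif', 's10_000.tif', 's11_00.tif')
# map() returns new values rather than changing files in place, so keep the result.
files = tuple(map(replacer, files))
print(files)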
?<= is a positive lookbehind assertion. You can find an explanation in the docs at Regular Expression Syntax.
0+ matches the run of zeros that follows.
The lambda function replaces every 0 with a #.
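To illustrate why the lookbehind matters: the "_" is checked but is not part of the match, so it survives the substitution. A small sketch comparing the two patterns on one file name from the question (the variable names are made up for illustration):

```python
import re

name = 's10_000.tif'

# Without the lookbehind, the underscore is part of the matched text.
with_underscore = re.search(r'_0+', name).group(0)

# With the lookbehind, only the zeros are matched.
lookbehind_only = re.search(r'(?<=_)0+', name).group(0)

print(with_underscore)  # _000
print(lookbehind_only)  # 000
```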
Upvotes: 5