Reputation: 580
>>> base64_encode = lambda url : url.encode('base64').replace('\n', '')
>>> s = '<A HREF="http://www.google.com" ID="test">blah</A>'
>>> re.sub(r'(?<=href=")([\w:/.]+)(?=")', base64_encode(r'\1'), s, flags=re.I)
<A HREF="XDE=" ID="test">blah</A>
The base64 encoding of the string http://www.google.com is aHR0cDovL3d3dy5nb29nbGUuY29t, not XDE=, which is the encoding of \1.
How do I pass the captured group into the function?
Upvotes: 6
Views: 3402
Reputation: 36708
Write your function to take a single parameter, which will be a match object (see http://docs.python.org/2.7/library/re.html#match-objects for details on these). Inside your function, use m.group(1) to get the first group from your match object m.
And when you pass the function to re.sub, pass the function object itself, without parentheses, so that re.sub can call it once per match:
re.sub("some regex", my_match_function, s, flags=re.I)
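To make the difference concrete, here is a minimal Python 3 sketch (the question uses Python 2's `str.encode('base64')`, so the `base64` module is swapped in here). The helper name `b64` is hypothetical and not from the question; the input string and pattern are the ones the asker posted.

```python
import base64
import re

def b64(text):
    # Hypothetical helper: base64-encode an ASCII string and return a str.
    return base64.b64encode(text.encode("ascii")).decode("ascii")

s = '<A HREF="http://www.google.com" ID="test">blah</A>'
pattern = r'(?<=href=")([\w:/.]+)(?=")'

# With parentheses, b64 runs immediately on the two-character string \1,
# so re.sub only ever receives the finished string.
eager = b64(r'\1')
wrong = re.sub(pattern, eager, s, flags=re.I)

# Passing a callable instead lets re.sub invoke it with each match object.
right = re.sub(pattern, lambda m: b64(m.group(1)), s, flags=re.I)

print(eager)  # XDE=
print(wrong)  # <A HREF="XDE=" ID="test">blah</A>
print(right)  # <A HREF="aHR0cDovL3d3dy5nb29nbGUuY29t" ID="test">blah</A>
```

The first call reproduces the output from the question, which suggests the backreference was never expanded before the encoding ran.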
Upvotes: 4
Reputation: 310069
You pass a function to re.sub and then you pull the group from there:
def base64_encode(match):
    """
    This function takes a re 'match object' and performs
    the appropriate substitutions
    """
    group = match.group(1)
    ... # Code to encode as base 64
    return result

re.sub(..., base64_encode, s, flags=re.I)
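One possible way to fill in the placeholder, as a Python 3 sketch rather than the answerer's own code: it assumes the `base64` module in place of the Python 2 codec from the question, and the two-link input string is made up for illustration. It also shows that re.sub calls the function separately for every match.

```python
import base64
import re

def base64_encode(match):
    """Return the base64 encoding of the first captured group."""
    group = match.group(1)
    # b64encode works on bytes, so encode first and decode the result back to str.
    result = base64.b64encode(group.encode("utf-8")).decode("ascii")
    return result

# Illustrative input with two links (not from the question).
s = ('<A HREF="http://www.google.com" ID="a">one</A> '
     '<a href="http://docs.python.org/" id="b">two</a>')
encoded = re.sub(r'(?<=href=")([\w:/.]+)(?=")', base64_encode, s, flags=re.I)

# Decode the rewritten attributes to confirm each URL was encoded on its own.
decoded = [base64.b64decode(v).decode("utf-8")
           for v in re.findall(r'href="([^"]*)"', encoded, flags=re.I)]
print(decoded)  # ['http://www.google.com', 'http://docs.python.org/']
```

Because the pattern uses lookarounds, the match itself is only the URL, so the returned string replaces just the attribute value and leaves the surrounding markup untouched.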
Upvotes: 12