Reputation: 26627
I'm a bit stuck with a regular expression. I have a string in the format
{% 'ello %} wor'ld {% te'st %}
and I want to escape only apostrophes that aren't between {% ... %}
tags, so the expected output is
{% 'ello %} wor&quot;ld {% te'st %}
I know I can replace all of them just using the string replace
function, but I'm at a loss as to how to use a regex to match only those outside the braces.
Upvotes: 2
Views: 1304
Reputation: 41838
bcloughlan, resurrecting this question because it had a simple solution that wasn't mentioned. (Found your question while doing some research for a general question about how to exclude patterns in regex.)
Here's a simple regex:
{%.*?%}|(\')
The left side of the alternation matches complete {% ... %}
tags. We will ignore these matches. The right side matches and captures apostrophes to Group 1, and we know they are the right apostrophes because they were not matched by the expression on the left.
This program shows how to use the regex:
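import re

subject = "{% 'ello %} wor'ld {% te'st %}"
regex = re.compile(r'{%.*?%}|(\')')

def myreplacement(m):
    # Group 1 is set only when the right side of the alternation matched,
    # i.e. for an apostrophe outside a {% ... %} tag.
    if m.group(1):
        return "&quot;"
    else:
        # A whole tag matched: put it back unchanged.
        return m.group(0)

replaced = regex.sub(myreplacement, subject)
print(replaced)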
Upvotes: 0
Reputation: 387915
If you want to use a regular expression, you could do it like this:
>>> import re
>>> s = """'{% 'ello %} wor'ld {% te'st %}'"""
>>> segments = re.split( r'(\{%.*?%\})', s )
>>> for i in range( 0, len( segments ), 2 ):
...     segments[i] = segments[i].replace( '\'', '&quot;' )
...
>>> ''.join( segments )
"&quot;{% 'ello %} wor&quot;ld {% te'st %}&quot;"
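Compared with Ehsan’s look-ahead solution, this has the benefit that you can run any kind of replacement or analysis on the segments without having to run another regular expression. Because the split pattern is a capturing group, the tags themselves are kept in the list at the odd indexes, which is why the loop only touches the even ones. So if you decide to replace another character, you can easily do that in the loop.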
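A minimal sketch of that last point, assuming a hypothetical helper name `escape_outside_tags` and a hypothetical `replacements` mapping (neither is part of the answer above). It applies every replacement to the even-indexed text segments and leaves the captured tags alone:

```python
import re

# The capturing group makes re.split keep the {% ... %} tags in the result,
# so tags land at odd indexes and plain text at even indexes.
TAG_SPLIT = re.compile(r'(\{%.*?%\})')

def escape_outside_tags(s, replacements):
    """Apply each old -> new replacement only to text outside {% ... %} tags.

    `replacements` is a hypothetical dict such as {"'": "&quot;"}.
    """
    segments = TAG_SPLIT.split(s)
    for i in range(0, len(segments), 2):
        for old, new in replacements.items():
            segments[i] = segments[i].replace(old, new)
    return ''.join(segments)

print(escape_outside_tags("{% 'ello %} wor'ld {% te'st %}", {"'": "&quot;"}))
```

The replacements run one after another in dict order, so a later replacement also sees the output of an earlier one; that matters if, say, `&` is replaced after `'`.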
Upvotes: 2
Reputation: 3020
Just for fun, this is the way to do it with regex:
>>> import re
>>> instr = "{% 'ello %} wor'ld {% te'st %}"
>>> re.sub(r'\'(?=(.(?!%}))*({%|$))', r'&quot;', instr)
"{% 'ello %} wor&quot;ld {% te'st %}"
It uses a positive lookahead to find either {% or the end of the string, and a negative lookahead inside that positive lookahead to make sure no %} appears between the apostrophe and that point.
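To make the nesting easier to follow, here is a sketch of the same pattern written with `re.VERBOSE` so each part carries a comment. The name `OUTSIDE_APOSTROPHE` is made up for illustration, and the groups are switched to non-capturing ones, which should not change what is matched:

```python
import re

# Same idea as the one-liner above, spelled out part by part.
OUTSIDE_APOSTROPHE = re.compile(r"""
    '                       # an apostrophe ...
    (?=                     # ... followed by (without consuming anything):
        (?: . (?!%}) )*     #   a run of characters, none directly followed by %}
        (?: {% | $ )        #   and then an opening tag or the end of the string
    )
""", re.VERBOSE)

instr = "{% 'ello %} wor'ld {% te'st %}"
print(OUTSIDE_APOSTROPHE.sub("&quot;", instr))
```

Since `.` does not match a newline unless `re.DOTALL` is passed, this sketch, like the one-liner, assumes the input is a single line.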
Upvotes: 3
Reputation: 93050
This can probably be done with regex, but it would be a complicated one. It's easier to write and read if you just do it directly:
def escape(s):
    isIn = False  # True while we are inside a {% ... %} tag
    ret = []
    for i in range(len(s)):
        if not isIn and s[i]=="'": ret += ["&quot;"]
        else: ret += s[i:i+1]
        if isIn and s[i:i+2]=="%}": isIn = False
        if not isIn and s[i:i+2]=="{%": isIn = True
    return "".join(ret)
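For a quick check, here is the same loop restated so the block runs on its own, with a couple of example calls. The `replacement` parameter and the `inside` / `out` names are additions for illustration, not part of the function above:

```python
def escape(s, replacement="&quot;"):
    """Replace apostrophes outside {% ... %} tags (same logic as above)."""
    inside = False  # True while scanning inside a {% ... %} tag
    out = []
    for i, ch in enumerate(s):
        if not inside and ch == "'":
            out.append(replacement)
        else:
            out.append(ch)
        # Update the state only after the current character has been emitted.
        if inside and s[i:i + 2] == "%}":
            inside = False
        elif not inside and s[i:i + 2] == "{%":
            inside = True
    return "".join(out)

print(escape("{% 'ello %} wor'ld {% te'st %}"))
print(escape("a'b {% c'd"))  # tag is never closed
```

One design consequence: because the flag is only cleared by `%}`, an unclosed `{%` is treated as running to the end of the string, so the apostrophe in `c'd` in the second call is left alone.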
Upvotes: 5