Reputation: 697
I have the following parsing scenario in Python. There are three cases of lines:
1) {{ name xxxxxxCONTENTxxxxx /}}
2) {{ name }} xxxxxxxCONTENTxxxxxxx {{ name /}}
3) {{ name xxxxxxCONTENTxxx {comand} xxxxCONTENTxxx /}}
All I need to do is classify which case a given line belongs to, using regex.
I can successfully distinguish between 1) and 2), but I'm having trouble dealing with 3).
To catch 1) I use:
re.match(r'\s*{{[^{]*?/}}\s*', line)
To catch 2) I use:
re.match(r'{{.*?}}', line)
and then raise a flag to keep the context, since case 2) can span multiple lines. How can I catch case 3)?
The condition I'm currently trying to test for is:
- start with '{{'
- end with '/}}'
- with no '{{' in between
However, I'm having a hard time expressing this as a regex.
Upvotes: 1
Views: 1018
Reputation: 626748
The conditions:
- start with '{{'
- end with '/}}'
- with no '{{' in between
are a perfect fit for a tempered greedy token.
^{{(?:(?!{{|/}}).)*/}}$
   ^^^^^^^^^^^^^^^^
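The '(?:(?!{{|/}}).)*' part matches any character, zero or more times, as long as that character does not start a '{{' or '/}}' sequence (so it matches up to the first '/}}'). The anchors ('^' and '$') ensure that only a whole string is matched: one that starts with '{{', ends with '/}}', and has no '{{' inside. Note that with re.match you do not need the '^' anchor, since re.match only matches at the start of the string.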
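As a quick check of the pattern above, here is a minimal sketch that runs it against the three sample lines from the question. The names SELF_CLOSING, samples, and results are only illustrative; the '^' anchor is dropped because re.match already anchors at the start.

```python
import re

# Tempered greedy token: consume one character at a time, but only while
# neither '{{' nor '/}}' starts at the current position.
SELF_CLOSING = re.compile(r'{{(?:(?!{{|/}}).)*/}}$')

samples = [
    '{{ name xxxxxxCONTENTxxxxx /}}',                        # case 1
    '{{ name }} xxxxxxxCONTENTxxxxxxx {{ name /}}',          # case 2
    '{{ name xxxxxxCONTENTxxx {comand} xxxxCONTENTxxx /}}',  # case 3
]

# True when the whole line is a single '{{ ... /}}' with no '{{' inside.
results = [bool(SELF_CLOSING.match(s)) for s in samples]
print(results)  # [True, False, True]
```

Note that this pattern alone accepts both case 1 and case 3, because a single '{' is not excluded by the lookahead; it only rejects case 2.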
Now, to match only the third type of string, the pattern must also require a '{...}' substring:
^{{(?:(?!{{|/}}).)*{[^{}]*}(?:(?!{{|/}}).)*/}}$
   |----- 1 ------||- 2 --||----- 1 ------|
Part 1 is the tempered greedy token described above, and part 2, '{[^{}]*}', matches a single '{...}' substring, making it compulsory in the input.
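Putting the pieces together, one possible way to classify a line is to test the most specific pattern first. This is only a sketch: classify, TOKEN, and the CASE_* names are hypothetical, the optional surrounding whitespace ('\s*') is borrowed from the question's first pattern, and the case 2 check simply reuses the question's opening-tag regex.

```python
import re

TOKEN = r'(?:(?!{{|/}}).)*'  # tempered greedy token: no '{{' or '/}}' inside

# Case 3: self-closing tag that must contain a single '{...}' group.
CASE_3 = re.compile(r'\s*{{' + TOKEN + r'{[^{}]*}' + TOKEN + r'/}}\s*$')
# Case 1: self-closing tag; also matches case 3, so it is tested second.
CASE_1 = re.compile(r'\s*{{' + TOKEN + r'/}}\s*$')
# Case 2: an opening '{{ name }}' tag (pattern taken from the question).
CASE_2 = re.compile(r'\s*{{.*?}}')


def classify(line):
    """Return 1, 2, or 3 for the matching case, or None if nothing matches."""
    if CASE_3.match(line):
        return 3
    if CASE_1.match(line):
        return 1
    if CASE_2.match(line):
        return 2
    return None


print(classify('{{ name xxxxxxCONTENTxxxxx /}}'))                        # 1
print(classify('{{ name }} xxxxxxxCONTENTxxxxxxx {{ name /}}'))          # 2
print(classify('{{ name xxxxxxCONTENTxxx {comand} xxxxCONTENTxxx /}}'))  # 3
```

The order matters: checking case 1 before case 3 would label every case 3 line as case 1. Tracking the multi-line context for case 2 is left to the flag described in the question.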
Upvotes: 1