Gerrie van Wyk

Reputation: 697

Regex to not match when a string appears between two strings

I have the following parsing scenario in Python. There are three kinds of lines:

  1. {{ name xxxxxxCONTENTxxxxx /}}
  2. {{ name }} xxxxxxxCONTENTxxxxxxx {{ name /}}
  3. {{ name xxxxxxCONTENTxxx {comand} xxxxCONTENTxxx /}}

All I need to do is classify, using regex, which case a given line belongs to.

I can successfully distinguish between 1) and 2), but I am having trouble dealing with 3).

To catch 1) I use:

re.match(r'\s*{{[^{]*?/}}\s*', line)

To catch 2) I use:

re.match('{{.*?}}',line)

and then raise a flag to keep the context, since case 2) can span multiple lines. How can I catch case 3)?

The condition I am currently trying to express is:

- start with '{{'
- end with '/}}'
- with no '{{' in between

However, I'm having a hard time phrasing this in regex.

Upvotes: 1

Views: 1018

Answers (1)

Wiktor Stribiżew

Reputation: 626748

The conditions:

- start with '{{'
- end with '/}}'
- with no '{{' in between

are a perfect fit for a tempered greedy token.

^{{(?:(?!{{|/}}).)*/}}$
   ^^^^^^^^^^^^^^^^

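The (?:(?!{{|/}}).)* part matches any character, zero or more times, as long as that character does not start a {{ or /}} sequence (so it matches up to the first /}}). The anchors (^ and $) restrict the match to a whole string that starts with {{, ends with /}}, and has no {{ inside. Note that with re.match you do not need the ^ anchor, because re.match already starts matching at the beginning of the string.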
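
As a minimal sketch of how this pattern could be used in Python (the names SELF_CLOSING and is_self_closing are illustrative, and stripping surrounding whitespace is an assumption based on the \s* in the question's pattern):

```python
import re

# Tempered greedy token: consume one character at a time, but only while
# neither "{{" nor "/}}" starts at the current position.
# Braces are escaped here for readability; the unescaped form also works.
SELF_CLOSING = re.compile(r"\{\{(?:(?!\{\{|/\}\}).)*/\}\}")


def is_self_closing(line):
    """Return True if the whole line is a single {{ ... /}} with no inner {{."""
    # fullmatch anchors both ends, so ^ and $ are not needed in the pattern.
    # Assumption: leading and trailing whitespace is not significant.
    return SELF_CLOSING.fullmatch(line.strip()) is not None


print(is_self_closing("{{ name xxxxxxCONTENTxxxxx /}}"))
print(is_self_closing("{{ name }} xxxxxxxCONTENTxxxxxxx {{ name /}}"))
```

Note that a line of the third kind, with a single-brace {comand} inside, also satisfies this pattern, since a lone { is not a {{ sequence.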

Now, to match only the third type of string, the pattern must also require a {...} substring:

^{{(?:(?!{{|/}}).)*{[^{}]*}(?:(?!{{|/}}).)*/}}$
   | ----  1 -----|| - 2 -||--------1-----|

Part 1 is the tempered greedy token described above, and part 2, {[^{}]*}, matches a single {...} substring, which makes it compulsory in the input.
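
Putting the pieces together, a classifier for the three cases might look roughly like the sketch below. The function name classify, the return values, and the order of the checks are assumptions for illustration; the opening-tag pattern is the one from the question. The most specific pattern is tested first because a case 3 line also satisfies the case 1 pattern.

```python
import re

# Tempered greedy token shared by both patterns (braces escaped for clarity).
TOKEN = r"(?:(?!\{\{|/\}\}).)*"

# Case 3: {{ ... {command} ... /}} with a compulsory single-brace group.
WITH_COMMAND = re.compile(r"\{\{" + TOKEN + r"\{[^{}]*\}" + TOKEN + r"/\}\}")
# Case 1 (and case 3): {{ ... /}} with no "{{" in between.
SELF_CLOSING = re.compile(r"\{\{" + TOKEN + r"/\}\}")
# Case 2: an opening {{ name }} tag, using the pattern from the question.
OPENING = re.compile(r"\{\{.*?\}\}")


def classify(line):
    """Return 1, 2, or 3 for the cases in the question, or None otherwise."""
    text = line.strip()  # assumption: surrounding whitespace is not significant
    if WITH_COMMAND.fullmatch(text):
        return 3
    if SELF_CLOSING.fullmatch(text):
        return 1
    if OPENING.match(text):
        return 2
    return None


for sample in (
    "{{ name xxxxxxCONTENTxxxxx /}}",
    "{{ name }} xxxxxxxCONTENTxxxxxxx {{ name /}}",
    "{{ name xxxxxxCONTENTxxx {comand} xxxxCONTENTxxx /}}",
):
    print(classify(sample), sample)
```

Handling a case 2 block that continues over several lines would still need the context flag mentioned in the question; this sketch only looks at one line at a time.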

Upvotes: 1
