erikscandola

Reputation: 2936

Match string between a specific word and paired brackets after it with a single nested level support with an exception

I have a problem with a regex match. I need to find a specific substring in a string. Some examples:

1. IF[A != B; C[0]; D] ==> IF[A != B; C[0]; D]
2. IF[A != B; IF[E < F; ...; ...]; D] ==> IF[E < F; ...; ...]
3. IF[A != B; C; D] ==> IF[A != B; C; D]

So, I have this regular expression: IF\[([^\[\]]*)\]. It works fine in cases 2 and 3, but in case 1 there is C[0], which contains square brackets.

I tried changing my regex to IF\[((?!IF))\] and finally to IF\[(.+(?!IF))\]. I added a lookahead to say "keep the IF that does not contain another IF". Now it works in cases 1 and 3, but case 2 returns the entire string.

How can I write a correct lookahead to solve this problem? I need to find the innermost IF in the string, which may be the entire string.

I already tried the solution in this answer: https://stackoverflow.com/a/32747960/5731129

Upvotes: 1

Views: 89

Answers (1)

Wiktor Stribiżew

Reputation: 626691

You want to match IF[...] substrings in which the text between the square brackets may contain another pair of square brackets, as long as that pair is not preceded by IF, with only a single level of nesting.

For that, you may use

IF\[([^][]*(?:(?<!\bIF)\[[^][]*][^][]*)*)]

See the regex demo

Details

  • IF\[ - an IF[ substring
  • ([^][]*(?:(?<!\bIF)\[[^][]*][^][]*)*) - Group 1:
    • [^][]* - 0+ chars other than [ and ]
    • (?:(?<!\bIF)\[[^][]*][^][]*)* - 0 or more occurrences of
      • (?<!\bIF)\[ - a [ char that is not immediately preceded with a whole word IF (\b is a word boundary)
      • [^][]* - 0+ chars other than [ and ]
      • ] - a ] char
      • [^][]* - 0+ chars other than [ and ]
  • ] - a ] char.
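
As a quick check, here is a minimal sketch that runs the pattern against the three examples from the question. It assumes Python's `re` module, since the thread does not say which engine is used; the names `INNERMOST_IF` and `find_innermost_if` are made up for illustration.

```python
import re

# The pattern from the answer, unchanged. Python's re module is an
# assumption here; the question does not name a regex engine.
INNERMOST_IF = re.compile(r"IF\[([^][]*(?:(?<!\bIF)\[[^][]*][^][]*)*)]")


def find_innermost_if(text):
    """Return the first IF[...] whose body holds no nested IF[, or None."""
    match = INNERMOST_IF.search(text)
    return match.group(0) if match else None


# The three inputs listed in the question.
samples = [
    "IF[A != B; C[0]; D]",
    "IF[A != B; IF[E < F; ...; ...]; D]",
    "IF[A != B; C; D]",
]

for sample in samples:
    print(sample, "==>", find_innermost_if(sample))
```

For the second input, the attempt starting at the outer IF fails because the only [ inside it is preceded by IF, so the search moves on and finds the inner IF[E < F; ...; ...].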

Upvotes: 1
