MKII

Reputation: 911

Ignore brackets in string literal

Using balancing groups, it is easy to match brackets. For example, \s*\{(?:[^{}]|(?<counter>\{)|(?<-counter>\}))+(?(counter)(?!))\} correctly checks the curly braces in this example:

{
   {
   "correct";
   }
}

The issue is that this pattern doesn't work if a string literal contains a brace, e.g.:

{
    {
    "wrong}";
    }
}

Checking that the quotes are matched isn't difficult, but I don't see how to work that into the original regex. How would I make the balancing group ignore braces inside string literals?

Upvotes: 0

Views: 220

Answers (1)

ctwheels

Reputation: 22837

Brief

Regex is not the best tool for what you're trying to do; however, that doesn't mean it's impossible.
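For comparison, here is a minimal sketch of a non-regex approach: a character-by-character scanner that tracks brace depth and skips everything inside double-quoted string literals. It is written in Python purely for illustration; the function name `braces_balanced` is hypothetical, and it assumes strings are delimited by `"` with backslash escapes only.

```python
def braces_balanced(text: str) -> bool:
    """Return True if curly braces outside string literals are balanced."""
    depth = 0          # current nesting depth of '{'
    in_string = False  # True while inside a double-quoted literal
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            if ch == "\\":
                i += 1             # skip the escaped character
            elif ch == '"':
                in_string = False  # unescaped quote ends the literal
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:          # closing brace with no opener
                return False
        i += 1
    # Balanced only if every brace closed and no literal is left open.
    return depth == 0 and not in_string


print(braces_balanced('{ { "wrong}"; } }'))   # brace inside the string is ignored
print(braces_balanced('{ { "correct"; }'))    # one closing brace is missing
```

The scanner makes the same decision the regex has to make (is this brace inside a string?), but keeps the state in two plain variables instead of a capture stack.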


Code


\s*\{(?:"(?:(?<!\\)\\(?:\\{2})*"|[^"])*"|[^{}]|(?<counter>\{)|(?<-counter>\}))+(?(counter)(?!))\}

Note: I simply added "(?:(?<!\\)\\(?:\\{2})*"|[^"])*"| as the first alternative inside your pattern's group, so I'll only explain that portion.

A shorter method (thanks to PhiLho's answer on regex for a quoted string with escaping quotes) is as follows.


\s*\{(?:"(?:[^"\\]|\\.)*"|[^{}]|(?<counter>\{)|(?<-counter>\}))+(?(counter)(?!))\}
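To see the string-literal part of the shorter pattern in isolation, here is a small Python sketch. Python's `re` module does not support balancing groups, so this only exercises the `"(?:[^"\\]|\\.)*"` alternative; the helper name `strip_string_literals` is made up for the example.

```python
import re

# The quoted-string alternative from the shorter pattern:
# an opening quote, then any run of (non-quote, non-backslash) characters
# or (backslash + any character) pairs, then a closing quote.
STRING_LITERAL = re.compile(r'"(?:[^"\\]|\\.)*"')


def strip_string_literals(text: str) -> str:
    """Replace each double-quoted literal with an empty pair of quotes."""
    return STRING_LITERAL.sub('""', text)


sample = '{ { "wrong}"; } }'
print(strip_string_literals(sample))  # the brace inside the literal is gone
```

Because the literal is consumed as a single unit, any braces inside it never reach the `[^{}]`, `(?<counter>\{)` or `(?<-counter>\})` alternatives.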

Explanation

I took the idea from a regex I recently used to answer another question and applied it to yours. It allows escaped double quotes, as well as opening and closing curly braces, inside a string literal.

  • " Match this literally
  • (?:(?<!\\)\\(?:\\{2})*"|[^"])* Match either of the following any number of times
    • (?<!\\)\\(?:\\{2})*" Match the following
      • (?<!\\) Negative lookbehind ensuring what precedes is not a literal backslash \
      • \\ Match a literal backslash
      • (?:\\{2})* Match two literal backslashes any number of times (0, 2, 4, 6, etc. backslashes in total)
      • " Match this literally
    • [^"] Match any character except " literally
  • " Match this literally

Note: (?<!\\)\\(?:\\{2})*" ensures that escaped double quotes are matched properly. In other words, a double quote preceded by an odd number of backslashes (\", \\\", \\\\\", etc.) is an escaped quote, while one preceded by an even number of backslashes (\\", \\\\", \\\\\\", etc.) is not escaped and therefore terminates the string.
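The odd/even rule in the note can be checked without a regex at all. The following Python sketch counts the backslashes immediately before a quote; the function name `is_escaped_quote` is an illustrative assumption, not part of any library.

```python
def is_escaped_quote(text: str, index: int) -> bool:
    """Return True if the '"' at text[index] follows an odd number of backslashes."""
    if text[index] != '"':
        raise ValueError("text[index] must be a double quote")
    count = 0
    i = index - 1
    # Walk backwards over the run of backslashes directly before the quote.
    while i >= 0 and text[i] == "\\":
        count += 1
        i -= 1
    return count % 2 == 1


print(is_escaped_quote(r'\"', 1))    # one backslash: escaped
print(is_escaped_quote(r'\\"', 2))   # two backslashes: string terminator
```

An even count means the backslashes pair up and escape each other, which leaves the quote free to close the string.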

Upvotes: 2
