Reputation: 911
Using balancing groups, it is easy to match brackets. For example, the pattern
s*\{(?:[^{}]|(?<counter>\{)|(?<-counter>\}))+(?(counter)(?!))\}
correctly checks the braces in this example:
{
{
"correct";
}
}
One issue I have is that this pattern doesn't work if a string literal contains a brace, e.g.
{
{
"wrong}";
}
}
Checking that the quotes are matched isn't difficult, but I fail to see how to adapt that into the original regex. How would I make it so that the balancing group ignores brackets inside string literals?
Upvotes: 0
Views: 220
Reputation: 22837
Regex is not the best tool for what you're trying to do; however, that doesn't mean it's impossible.
s*\{(?:"(?:(?<!\\)\\(?:\\{2})*"|[^"])*"|[^{}]|(?<counter>\{)|(?<-counter>\}))+(?(counter)(?!))\}
Note: I simply added "(?:(?<!\\)\\(?:\\{2})*"|[^"])*"| as the first alternative inside the group of your pattern, so I'll only explain that portion.
A shorter method (thanks to PhiLho's answer on regex for a quoted string with escaping quotes) is as follows.
s*\{(?:"(?:[^"\\]|\\.)*"|[^{}]|(?<counter>\{)|(?<-counter>\}))+(?(counter)(?!))\}
I took the idea from a regex I recently used to answer another question and applied it to yours. It allows for escaped double quotes inside string literals as well as your opening/closing curly braces.
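To see the idea outside of .NET, here is a minimal Python sketch. Python's re module has no balancing groups, so this swaps in a different technique: the string-literal alternative from the shorter pattern is reused to tokenize the input, and a plain integer plays the role of the counter group. The names TOKEN and braces_balanced are made up for illustration, and unlike the regex (which locates a balanced block inside a larger text) this checks the whole input.

```python
import re

# One token per match: a whole "..." literal (with backslash escapes),
# an opening brace, or a closing brace. The first alternative is the
# string-literal part of the shorter pattern above.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|\{|\}')

def braces_balanced(text: str) -> bool:
    """Hypothetical helper: True if braces outside string literals balance."""
    depth = 0  # stands in for the .NET "counter" group
    for match in TOKEN.finditer(text):
        token = match.group()
        if token == "{":
            depth += 1
        elif token == "}":
            depth -= 1
            if depth < 0:  # a closing brace with nothing left open
                return False
        # A string literal is consumed as a single token, so any braces
        # inside it never reach the two branches above.
    return depth == 0

correct = '{\n{\n"correct";\n}\n}'
wrong = '{\n{\n"wrong}";\n}\n}'
print(braces_balanced(correct), braces_balanced(wrong))
```

Both inputs from the question are reported as balanced, because the } inside "wrong}" is swallowed by the string-literal alternative before the brace alternatives get a chance to see it. An unterminated string literal is not treated specially in this sketch; braces after a lone quote are simply counted.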
"
Match this literally
(?:(?<!\\)\\(?:\\{2})*"|[^"])*
Match either of the following any number of times
    (?<!\\)\\(?:\\{2})*"
    Match the following
        (?<!\\)
        Negative lookbehind ensuring what precedes is not a literal backslash \
        \\
        Match a literal backslash
        (?:\\{2})*
        Match two literal backslashes any number of times (2, 4, 6, 8, etc.)
        "
        Match this literally
    [^"]
    Match any character except " literally
"
Match this literally
Note: (?<!\\)\\(?:\\{2})*" ensures it properly matches escaped double quotes ". This basically says: match any odd number of backslashes preceding a double quote character ", such that \", \\\", \\\\\", etc. are valid, and \\", \\\\", \\\\\\" are invalid escaped double quotes " (and thus a string termination).
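The odd/even rule in the note can be checked directly. The sketch below uses Python rather than .NET (the sub-pattern uses only a fixed-width lookbehind, which Python's re also supports); the names ESCAPED_QUOTE and results are made up for illustration.

```python
import re

# The sub-pattern from the note: a backslash not preceded by another
# backslash, then zero or more backslash pairs, then a double quote.
ESCAPED_QUOTE = re.compile(r'(?<!\\)\\(?:\\{2})*"')

# Build n backslashes followed by a quote, for n = 1..6, and record
# whether the whole fragment is matched as an escaped quote.
results = {
    n: ESCAPED_QUOTE.fullmatch("\\" * n + '"') is not None
    for n in range(1, 7)
}
print(results)
# {1: True, 2: False, 3: True, 4: False, 5: True, 6: False}
```

Odd counts match as an escaped quote; even counts do not, so in those cases the quote is left to terminate the string.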
Upvotes: 2