Reputation: 6322
I came across a regex in the pickaxe Ruby book for finding balanced brace-expressions, and I'm trying to build upon it to get a regex that matches balanced braces/brackets/parens.
The original:
re = /
\A
(?<brace_expression>
{
(
[^{}] # anything other than braces
| # ...or...
\g<brace_expression> # a nested brace expression
)*
}
)
\Z
/x
My version so far:
re = /
\A
(?<brace_expression>
(?:
(?<brace> { ) | (?<bracket> \[ ) | ( \( )
)
(
[^{}\[\]()] # anything other than braces
| # ...or...
\g<brace_expression> # a nested brace expression
)*
(?(<brace>) } | (?(<bracket>) \] | \) ) )
)
\Z
/x
It correctly matches "{xyz}", "[xyz]", "(xyz)", and correctly fails to match something like "{xyz]", but the recursion isn't behaving as I'd expect. It fails to match nested brace expressions like "{[]}". What am I missing?
Upvotes: 3
Views: 531
Reputation: 18555
Interesting question. Your current pattern looks quite good. What about using alternation instead of conditionals which seem to be unreliable when using together with recursion.
re = /
\A( # start group 1
\(([^)(\]\[}{]+|\g<1>)*+\)| # parens & set group 2
\[\g<2>*+\]| # brackets
\{\g<2>*+\} # braces
)\z # group 1 end
/x
\g<1>
is the subexpression call to first group which holds the pattern between start and end.\g<2>
is a call to group 2 which holds [^)(\]\[}{]+|\g<1>
for reducing the pattern.*+
is a possessive quantifier to improve failing-performance if unbalanced.See demo at Regex101 (PCRE) or Rubular v1.9.3+
Upvotes: 2