Reed Debaets

Reputation: 522

Ruby recursive regex

So why is this not working? I'm building a regex to match a formula (which is in turn part of a larger standard description), but I'm stuck: it doesn't match formulas nested inside another formula.

stat        = /(Stat3|Stat2|Stat1)/

number_sym  = /[0-9]*/
formula_sym = /((target's )?#{stat}|#{number_sym}|N#{number_sym})\%?/
math_sym    = /(\+|\-|\*|\/|\%)/

formula     = /^\((#{formula}|#{formula_sym})( #{math_sym} (#{formula}|#{formula_sym}))?\)$/

p "(target's Stat2 * N1%)".match(formula).to_s #matches
p "((target's Stat2 * N1%) + 3)".match(formula).to_s #no match
p "(Stat1 + ((target's Stat2 * N1%) + 3))".match(formula).to_s #no match

Upvotes: 6

Views: 3976

Answers (3)

Peter Marreck

Reputation: 53

/(
  (?<non_grouping_char>
    [^\(\{\[\<\)\}\]\>]
  ){0}
  (?<parens_group>
    \( \g<content> \)
  ){0}
  (?<brackets_group>
    \[ \g<content> \]
  ){0}
  (?<chevrons_group>
    \< \g<content> \>
  ){0}
  (?<braces_group>
    \{ \g<content> \}
  ){0}
  (?<balanced_group>
    (?>
      \g<parens_group>   |
      \g<brackets_group> |
      \g<chevrons_group> |
      \g<braces_group>
    )
  ){0}
  (?<content>
    (?> \g<balanced_group> | \g<non_grouping_char> )*
  ){0}
  \A \g<content> \Z
)/uix

This works for me, and should work in any regexp engine that supports named groups and subexpression calls (\g<name>). It validates any content that has either no groups or properly nested grouping characters, to any depth. Beer me if this helps you.
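The same define-then-call technique, (?<name>...){0} plus \g<name>, can be applied to the formula from the question. The following is a minimal sketch rather than the asker's code: the group names are made up for illustration, it assumes Ruby 2.4 or later (for match?), it writes literal spaces as \x20 because the x flag ignores plain whitespace, and it requires at least one digit in a number, unlike the original [0-9]*.

```ruby
# Sketch only: the group names (stat, operand, op, term, formula) are hypothetical.
# Each (?<name> ... ){0} defines a group without matching anything at that point;
# \g<name> calls it, and formula recurses into itself through term.
FORMULA = /
  (?<stat>    Stat[123] ){0}
  (?<operand> (?: (?: target's\x20 )? \g<stat> | N?[0-9]+ ) %? ){0}
  (?<op>      [-+*\/%] ){0}
  (?<term>    \g<formula> | \g<operand> ){0}
  (?<formula> \( \g<term> (?: \x20 \g<op> \x20 \g<term> )? \) ){0}
  \A \g<formula> \z
/x

samples = [
  "(target's Stat2 * N1%)",
  "((target's Stat2 * N1%) + 3)",
  "(Stat1 + ((target's Stat2 * N1%) + 3))",
  "(Stat1 + (3)"                      # unbalanced, should be rejected
]

samples.each { |s| puts "#{s.inspect}: #{FORMULA.match?(s)}" }
```

The first three strings are the ones from the question; the last one is an extra unbalanced input added to show a rejection.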

Upvotes: 5

Kathy Van Stone

Reputation: 26271

You can't use recursion like that: each #{formula} inside the definition of formula is interpolated as an empty string, because formula has no value yet at that point. What you want is beyond the ability of regular expressions, which cannot even match nested parentheses. I suspect you will need an actual parser to do what you want. Check out Treetop, for example.
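To see the empty interpolation for yourself, here is a cut-down sketch of the question's definition (formula_sym is simplified to digits only, which is my own shortening). Ruby creates the local variable formula when it parses the assignment, so it is nil while the right-hand side is built, and nil interpolates as "".

```ruby
# Simplified stand-in for the question's formula_sym (an assumption for brevity).
formula_sym = /[0-9]+/

# formula is still nil here, so #{formula} contributes nothing to the pattern.
formula = /^\((#{formula}|#{formula_sym})\)$/

puts formula.source            # the first alternative inside the group is empty

p "(3)".match(formula).to_s    # one level of parentheses matches
p "((3))".match(formula).to_s  # nested parentheses do not: prints ""
p "()".match(formula).to_s     # the empty alternative even lets "()" through
```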

Upvotes: 1

Paige Ruten

Reputation: 176645

When you use the #{ } syntax, Ruby converts the Regexp object to a string using to_s. Look at what happens when you convert a Regexp object to a string:

irb> re = /blah/
  => /blah/
irb> re.to_s
  => "(?-mix:blah)"
irb> "my regex: #{re}"
  => "my regex: (?-mix:blah)"
irb> /my regex: #{re}/
  => /my regex: (?-mix:blah)/

To get the string you want (in my example, "blah"), use the Regexp#source method:

irb> re.source
  => "blah"

So to use your example:

formula_sym = /((target's )?#{stat.source}|#{number_sym.source}|N#{number_sym.source})\%?/
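A small sketch of what the two forms produce when the embedded regex carries a flag (the variable names are illustrative, and match? assumes Ruby 2.4 or later). Both interpolations give a valid regex; as far as I can tell, the practical difference is that to_s keeps the inner flags in a (?i-mx:...) wrapper while source drops them.

```ruby
re = /blah/i

with_to_s   = /x#{re}/          # interpolates re.to_s, so the i flag travels along
with_source = /x#{re.source}/   # interpolates only the bare pattern text

p with_to_s.source              # pattern text with the (?i-mx:...) wrapper
p with_source.source            # pattern text without it

p with_to_s.match?("xBLAH")     # inner part is still case-insensitive
p with_source.match?("xBLAH")   # case-sensitive, because the flag was dropped
```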

Upvotes: 7
