Jeremy Smith

Reputation: 15069

How are named capture groups used in RE2 regexps?

The page http://swtch.com/~rsc/regexp/regexp3.html says that RE2 supports named captures:

RE2 supports Python-style named captures (?P<name>expr), but not the alternate syntaxes (?<name>expr) and (?'name'expr) used by .NET and Perl.

ruby-1.9.2-p180 :003 > r = RE2::Regexp.compile("(?P<foo>.+) bla")   
#=> #<RE2::Regexp /(?P<foo>.+) bla/>

ruby-1.9.2-p180 :006 > r = r.match("lalal bla")   
#=> #<RE2::MatchData "lalal bla" 1:"lalal">

ruby-1.9.2-p180 :009 > r[1]   #=> "lalal"

ruby-1.9.2-p180 :010 > r[:foo]
TypeError: can't convert Symbol into Integer

ruby-1.9.2-p180 :011 > r["foo"]
TypeError: can't convert String into Integer

But I can't access the match by name, so the feature seems useless as implemented. Am I missing something?

Upvotes: 4

Views: 4816

Answers (1)

Paul Mucur

Reputation: 208

Looking at your code output, it seems that you are using the Ruby re2 gem, which I maintain.

As of the latest release (0.2.0), the gem does not support the underlying C++ re2 library's named capturing groups. The error you are seeing occurs because any non-integer argument passed to MatchData#[] is simply forwarded to the default Array#[]. You can confirm this in an irb session like so:

irb(main):001:0> a = [1, 2, 3]
=> [1, 2, 3]
irb(main):002:0> a["bob"]
TypeError: can't convert String into Integer
    from (irb):2:in `[]'
    from (irb):2
    from /Users/mudge/.rbenv/versions/1.9.2-p290/bin/irb:12:in `<main>'
irb(main):003:0> a[:bob]
TypeError: can't convert Symbol into Integer
    from (irb):3:in `[]'
    from (irb):3
    from /Users/mudge/.rbenv/versions/1.9.2-p290/bin/irb:12:in `<main>'
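
For illustration, here is a minimal sketch of the forwarding described above. The PositionalMatch class is hypothetical and is not the gem's actual source; it only assumes that captures are held in a plain Array and that #[] hands its argument straight to Array#[].

```ruby
# Hypothetical stand-in for a match object; NOT the re2 gem's real code.
class PositionalMatch
  def initialize(captures)
    @captures = captures # index 0 is the whole match, 1.. are the groups
  end

  # Every argument is passed straight to Array#[], so only integer-like
  # indexes work; a String or Symbol raises TypeError.
  def [](index)
    @captures[index]
  end
end

m = PositionalMatch.new(["lalal bla", "lalal"])
m[1] # positional lookup works

begin
  m[:foo]
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```

The exact wording of the TypeError message differs between Ruby versions, but the exception class is the same.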

I will endeavour to add the ability to reference captures by name as soon as possible and update this answer once a release has been made.

Update: I just released version 0.3.0 which now supports named groups like so:

irb(main):001:0> r = RE2::Regexp.compile("(?P<foo>.+) bla") 
=> #<RE2::Regexp /(?P<foo>.+) bla/>
irb(main):002:0> r = r.match("lalal bla") 
=> #<RE2::MatchData "lalal bla" 1:"lalal">
irb(main):003:0> r[1]
=> "lalal"
irb(main):004:0> r[:foo]
=> "lalal"
irb(main):005:0> r["foo"]
=> "lalal"
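
As a rough sketch of how name-based lookup of this kind can work in general, the hypothetical NamedMatch class below translates a String or Symbol into a capture index before indexing the array. The class, its constructor arguments, and the choice to return nil for an unknown name are all assumptions made for illustration; this is not how the gem is actually written.

```ruby
# Hypothetical sketch of name-aware lookup; NOT the re2 gem's real code.
class NamedMatch
  # captures: Array of matched strings, index 0 being the whole match
  # names:    Hash mapping a group name (String) to its capture index
  def initialize(captures, names)
    @captures = captures
    @names = names
  end

  def [](key)
    case key
    when String, Symbol
      index = @names[key.to_s]
      index && @captures[index] # nil for an unknown name (an assumption)
    else
      @captures[key]            # integers still index positionally
    end
  end
end

m = NamedMatch.new(["lalal bla", "lalal"], { "foo" => 1 })
m[1]     # => "lalal"
m[:foo]  # => "lalal"
m["foo"] # => "lalal"
```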

Upvotes: 5
