Reputation: 2631
My regular expression is (\d+_)* and the test string is 1_2_3_. Ruby matches the string correctly. However, the MatchData only returns "3_" for the group.
e.g.
irb(main):004:0> /(\d+_)*/.match("1_2_3_")
=> #<MatchData "1_2_3_" 1:"3_">
I'd expect something like #<MatchData "1_2_3_" 1:"1_", 2:"2_", 3:"3_">
Upvotes: 2
Views: 618
Reputation: 336178
Each new repetition of the group overwrites the previous match. Nearly all regex engines work that way. To my knowledge, only the .NET regex engine provides a means to access all the matches of a repeated group (a so-called "capture").
Imagine what's happening. In a regex, every pair of parentheses builds a capturing group; they are numbered from left to right. So in /(\d+_)*/, (\d+_) is capturing group number 1.
Now if you apply that regex to 1_2_, what happens?

1. (\d+_) matches 1_
2. 1_ is stored as the contents of the first capturing group. You could now access \1 to see these contents.
3. * tells the regex engine to retry the match from the current position.
4. (\d+_) now matches 2_
5. 2_ again needs to be stored in group number 1/backreference \1, so it overwrites whatever is in there.

To get the desired result in Ruby, you need to do two regex matches: /(?:\d+_)*/ for the overall match and /\d+_/ for each single match:
irb(main):001:0> s = "1_2_3_"
=> "1_2_3_"
irb(main):009:0> s.match(/(?:\d+_)*/)
=> #<MatchData "1_2_3_">
irb(main):007:0> s.scan(/\d+_/)
=> ["1_", "2_", "3_"]
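The two-step approach could be wrapped up roughly as follows. This is only a sketch: the helper name underscore_pieces is made up for illustration, and the \A and \z anchors are an added assumption so that the first regex validates the whole string rather than just a prefix.

```ruby
# Hypothetical helper (not from the question): returns every "digits_" piece,
# or nil when the whole string is not a run of such pieces.
def underscore_pieces(str)
  # \A and \z anchor the check to the entire string; (?:...) groups without capturing.
  return nil unless str.match?(/\A(?:\d+_)*\z/)
  # scan collects every non-overlapping match of the single-piece pattern.
  str.scan(/\d+_/)
end

# The original pattern, for comparison: group 1 keeps only the last repetition.
m = /(\d+_)*/.match("1_2_3_")
whole = m[0]  # the overall match
last  = m[1]  # only the final repetition of the group

pieces  = underscore_pieces("1_2_3_")
invalid = underscore_pieces("1_x_3_")
```

Here whole is "1_2_3_" and last is "3_", while pieces holds all three parts and invalid is nil because "x_" does not fit the pattern. String#match? requires Ruby 2.4 or later.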
Upvotes: 5
Reputation: 5563
"1_2_3_".scan(/\d+_/) # => ["1_", "2_", "3_"]
will get you what you are looking for (note the removal of the *). I also removed the grouping because with a capture group, scan returns an array of arrays, i.e. [["1_"], ["2_"], ["3_"]]
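To make the difference concrete, here is a small comparison of scan with and without a capture group. The variable names are illustrative only.

```ruby
s = "1_2_3_"

# No capture group: scan returns a flat array of the matched strings.
flat = s.scan(/\d+_/)

# One capture group: scan returns one sub-array per match, holding the group contents.
nested = s.scan(/(\d+_)/)

# A non-capturing group (?:...) keeps the result flat.
non_capturing = s.scan(/(?:\d+_)/)

# Capturing only the digits and flattening yields the numbers without underscores.
digits = s.scan(/(\d+)_/).flatten
```

flat and non_capturing are both ["1_", "2_", "3_"], nested is [["1_"], ["2_"], ["3_"]], and digits is ["1", "2", "3"].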
Upvotes: 0
Reputation: 32921
I believe you want .scan. It'll return an array of the matches.
Upvotes: 0