Pierrot

Reputation: 13

String#scan not capturing all occurrences

I'm seeing strange behaviour in the return value of Ruby's String#scan. With the code below, I can't work out why scan doesn't return 2 elements.

str = "10011011001"
regexp = "0110"
p str.scan(/(#{regexp})/)

==> [["0110"]]

The string str clearly contains 2 occurrences of the pattern "0110". I want to fetch all occurrences of my regexp in str.

Upvotes: 1

Views: 89

Answers (2)

Casimir et Hippolyte

Reputation: 89574

The reason is that after finding the first match, the regex engine resumes its search at the position just after that match. So the zero at the end of the first match can't be reused for another match.
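To make this concrete, here is a small sketch that records where each plain match starts in the string from the question. The variable name plain_offsets is only illustrative.

```ruby
str = "10011011001"

# Record the start offset of every match found by a plain (consuming) scan.
plain_offsets = []
str.scan(/0110/) { plain_offsets << Regexp.last_match.begin(0) }

# The only match starts at index 2 and consumes indices 2..5, so the search
# resumes at index 6. The second occurrence would have to start at index 5,
# which has already been consumed.
p plain_offsets  # => [2]
```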

The way to get overlapping results is to put your pattern in a capture group inside a lookahead. A lookahead is only a zero-width assertion (a test) and doesn't consume any characters. This way the regex engine always advances one character at a time and can test every position in the string, even when something is captured in the group:

(?=(yourpattern))

Your result is then in capture group 1.

With your example:

p str.scan(/(?=(0110))/)
[["0110"], ["0110"]]
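If you also want to know where each overlapping match starts, the lookahead can be wrapped in a small helper. This is only a sketch: overlapping_matches is a hypothetical name, and it assumes the pattern is a literal string (hence Regexp.escape) rather than an arbitrary regexp as in the question.

```ruby
# Hypothetical helper: returns [offset, text] pairs for every occurrence of a
# literal pattern, including overlapping ones.
def overlapping_matches(str, pattern)
  # The lookahead consumes nothing, so scan moves forward one character at a
  # time and capture group 1 holds the matched text at each position.
  lookahead = /(?=(#{Regexp.escape(pattern)}))/
  results = []
  str.scan(lookahead) do
    m = Regexp.last_match
    results << [m.begin(0), m[1]]
  end
  results
end

p overlapping_matches("10011011001", "0110")
# => [[2, "0110"], [5, "0110"]]
```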

Upvotes: 5

floum

Reputation: 1159

str = "10011011001"
match = "0110"

str.chars.each_cons(match.size).map(&:join).select { |cons| cons == match }

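This should do it: it compares every consecutive window of match.size characters against match, so overlapping occurrences are kept.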
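The same sliding-window idea can also report positions or a count. The following is a sketch under the same assumption as the code above, namely that match is a literal string and not a regexp. The names windows, positions and total are illustrative.

```ruby
str = "10011011001"
match = "0110"

# Every consecutive window of match.size characters, joined back into strings.
windows = str.chars.each_cons(match.size).map(&:join)

# Indices of the windows equal to the literal string.
positions = windows.each_index.select { |i| windows[i] == match }

# Number of occurrences, overlapping ones included.
total = windows.count(match)

p positions  # => [2, 5]
p total      # => 2
```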

Upvotes: 1
