mahemoff

Reputation: 46429

Ruby: Find first N regex matches in a string (and stop scanning)

I want to scan a very long string for regex matches. What is the most efficient way to find only the first N matches? For example, something like:

'abcabcabc'.scan /b/, limit: 2

would end successfully after 5 characters, if only scan supported a limit option.

(The string is several MB - a memoized data structure in memory - and this is a web request. Perf matters.)

Upvotes: 3

Views: 609

Answers (2)

Tamer Shlash

Reputation: 9523

Fortunately, Ruby regex supports lazy matching, so you can use it like this:

'abcabcabc'.match(/(b).*?(b)/)

Adding ? after .* makes it match lazily, stopping as soon as the regex has been fulfilled. From the Regexp class repetition documentation:

Repetition is greedy by default: as many occurrences as possible are matched while still allowing the overall match to succeed. By contrast, lazy matching makes the minimal amount of matches necessary for overall success. A greedy metacharacter can be made lazy by following it with ?.
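To make the difference concrete, here is a small sketch comparing the lazy and greedy forms on the question's example string. The match_span helper is hypothetical, introduced only for illustration; it returns the matched text and the offset just past the match.

```ruby
# Hypothetical helper for illustration: returns the matched text and the
# offset just past the end of the match, or nil when there is no match.
def match_span(str, pattern)
  m = str.match(pattern)
  m && [m[0], m.end(0)]
end

str = 'abcabcabc'

# Lazy: the match ends at the second "b", i.e. after 5 characters.
match_span(str, /(b).*?(b)/)      #=> ["bcab", 5]

# Greedy: the match runs on to the last "b" in the string.
match_span(str, /(b).*(b)/)       #=> ["bcabcab", 8]

# The two capture groups hold the individual matches.
str.match(/(b).*?(b)/).captures   #=> ["b", "b"]
```

Note that this pattern hard-codes N = 2, and that . does not match newlines unless the m flag is set, so a multi-line string would probably need /(b).*?(b)/m.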

Upvotes: 1

Stefan

Reputation: 114188

Not that elegant, but you could use the block form:

str = 'abcabcabc'

result = []
str.scan(/b/) { |match| result << match; break if result.size >= 2 }
result #=> ["b", "b"]
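If this is needed in more than one place, the block form could be wrapped in a small method. This is only a sketch: first_matches is a hypothetical name, not part of Ruby's String API, and the limit <= 0 guard is an added assumption about how an empty request should behave.

```ruby
# Hypothetical helper (not part of String): collect at most `limit`
# matches of `pattern`, breaking out of scan once enough were found.
def first_matches(str, pattern, limit)
  return [] if limit <= 0   # without this guard, one match would still be collected

  results = []
  str.scan(pattern) do |match|
    results << match
    break if results.size >= limit
  end
  results
end

first_matches('abcabcabc', /b/, 2)   #=> ["b", "b"]
first_matches('abc', /b/, 5)         #=> ["b"]

# The same idea via an Enumerator: `first` stops iterating once it has
# taken the requested number of elements.
'abcabcabc'.to_enum(:scan, /b/).first(2)   #=> ["b", "b"]
```

As with scan itself, a pattern containing capture groups yields arrays of captures rather than whole-match strings.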

Upvotes: 3
