seb

Reputation: 3324

Ruby Regexp: + vs *. special behaviour?

Using Ruby regexps, I get the following results:

>> 'foobar'[/o+/]
=> "oo"
>> 'foobar'[/o*/]
=> ""

But:

>> 'foobar'[/fo+/]
=> "foo"
>> 'foobar'[/fo*/]
=> "foo"

The documentation says:
*: zero or more repetitions of the preceding
+: one or more repetitions of the preceding

So I expect that 'foobar'[/o*/] returns the same result as 'foobar'[/o+/].

Does anybody have an explanation for that?

Upvotes: 7

Views: 220

Answers (2)

reko_t

Reputation: 56430

This is a common misunderstanding of how regexp works.

Although * is greedy and the pattern isn't anchored to the start of the string, the regexp engine still starts looking at the beginning of the string. In the case of "/o+/", there is no match at position 0 (i.e. "f"), but since + means one or more, the engine has to keep trying later positions (this has nothing to do with greediness) until a match is found or all positions have been evaluated.

With "/o*/", however, which as you know means zero or more times, the attempt at position 0 finds no "o", but zero occurrences is still a valid match. The engine therefore succeeds with an empty match at that point and stops (as it should, because o* simply means that the o is optional). Having found a match, it has no reason to spend more time looking further along the string.
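To see where each pattern actually matches, it can help to print the start offset alongside the matched text. The following is a minimal sketch; the helper name first_match is made up for illustration, while Regexp#match and MatchData#begin are standard Ruby:

```ruby
# Return [start_offset, matched_text] for the leftmost match, or nil.
# first_match is a hypothetical helper name, not part of Ruby.
def first_match(pattern, text)
  m = pattern.match(text)
  m && [m.begin(0), m[0]]
end

p first_match(/o+/, 'foobar')   # => [1, "oo"]   needs at least one "o", so it moves on to position 1
p first_match(/o*/, 'foobar')   # => [0, ""]     zero "o"s is a valid match at position 0
p first_match(/fo*/, 'foobar')  # => [0, "foo"]  "f" matches at 0, then o* greedily takes "oo"
```

In each case the engine reports the leftmost position at which the pattern can succeed; greediness only decides how much is consumed once that position is fixed.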

Upvotes: 3

Gareth

Reputation: 138032

'foobar'[/o*/] matches the zero o's that appear before the f, at position 0.
'foobar'[/o+/] can't match there because it needs at least one o, so it instead matches all the o's starting at position 1.

Specifically, the matches you are seeing are

'foobar'[/o*/] => '<>foobar'
'foobar'[/o+/] => 'f<oo>bar'
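
The bracket notation above can be reproduced directly, as a rough sketch, by wrapping the first match with String#sub. The helper name mark_first_match is an assumption for illustration only:

```ruby
# Wrap the first (leftmost) match in angle brackets.
# mark_first_match is a hypothetical helper name, not part of Ruby.
def mark_first_match(pattern, text)
  text.sub(pattern) { |matched| "<#{matched}>" }
end

puts mark_first_match(/o*/, 'foobar')  # prints <>foobar
puts mark_first_match(/o+/, 'foobar')  # prints f<oo>bar
```

Since sub replaces only the first match, the empty match of /o*/ at position 0 shows up as an empty pair of brackets in front of the f.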

Upvotes: 14
