fabregaszy

Reputation: 506

Something confusing about non-greedy regex match

I wrote a piece of Ruby code like the one below:

    #!/usr/bin/ruby
    s = "[[abc]]"
    if(s =~ /\[(.+)*?\]/)
        puts $1
    end
    if(s =~ /\[(.+?)\]/)
        puts $1
    end

Its output is:

[abc
[abc

Then I changed the variable s:

  s = "[[abc]]]"

The rest of the code remains the same, but now the result is:

[abc]
[abc

Why does this happen? Could anyone explain this to me?

Upvotes: 1

Views: 1776

Answers (1)

stema

Reputation: 93006

I am not sure anyone here will be able to explain this behaviour. I checked with Regexr, and there the regex behaves as you expect.

But

\[(.+)*?\]

is simply a badly designed expression. What should (.+)* match? That is a nested quantifier, and it can find a valid match in many different ways. Worse still, once the outer quantifier is made lazy, what should happen?

If you want greedy matching, use

\[(.+)\]

If you want lazy matching, use

\[(.+?)\]
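
As a rough sketch of how these two patterns differ, here they are run against both strings from the question. The helper name `first_capture` is made up for this illustration; it relies on `String#[]` with a regex and a group index, which returns that capture group or nil.

```ruby
# Hypothetical helper for illustration: returns capture group 1, or nil if there is no match.
def first_capture(pattern, str)
  str[pattern, 1]
end

greedy = /\[(.+)\]/   # .+ grabs as much as possible, then backs off until \] matches
lazy   = /\[(.+?)\]/  # .+? grabs as little as possible, then extends until \] matches

["[[abc]]", "[[abc]]]"].each do |s|
  puts "#{s}: greedy=#{first_capture(greedy, s)} lazy=#{first_capture(lazy, s)}"
end
# [[abc]]: greedy=[abc] lazy=[abc
# [[abc]]]: greedy=[abc]] lazy=[abc
```

The greedy version stops at the last `]` in the string, while the lazy version stops at the first one, which is why the lazy result does not change when an extra `]` is appended.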

But never nest quantifiers in a way that lets them find many possible solutions. This leads to catastrophic backtracking; see also the blog post by Jeff Atwood on Coding Horror about regex performance.
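
One common way to leave the engine only a single way to match is a negated character class instead of `.`. This is not part of the answer above, only a sketch under the assumption that the goal is the innermost bracketed text; the method name `inner_text` is hypothetical.

```ruby
# Assumption for illustration: we want the text inside the innermost pair of brackets.
def inner_text(str)
  # [^\[\]]+ matches one or more characters that are neither '[' nor ']',
  # so there is no choice about where the capture starts or ends.
  str[/\[([^\[\]]+)\]/, 1]
end

puts inner_text("[[abc]]")   # abc
puts inner_text("[[abc]]]")  # abc
```

Because the class excludes both brackets, the capture cannot grow or shrink across a `]`, so no backtracking between alternatives is needed.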

Upvotes: 3
