nakul

Reputation: 1473

Difference between ".+" and ".+?"

Can someone please explain the difference between .+ and .+?

I have the string: "extend cup end table"

  1. The pattern e.+d finds: extend cup end
  2. The pattern e.+?d finds: extend and end

I know that + means one or more and ? means zero or one, but I don't understand how they work together here.

Upvotes: 49

Views: 30770

Answers (2)

NPE

Reputation: 500277

Both will match any sequence of one or more characters. The difference is that:

  • .+ is greedy and consumes as many characters as it can.
  • .+? is reluctant and consumes as few characters as it can.

See Differences Among Greedy, Reluctant, and Possessive Quantifiers in the Java tutorial.

Thus:

  • e.+d finds the longest substring that starts at the leftmost possible e and ends with d (with at least one character in between). In your example, extend cup end will be found.
  • e.+?d finds the shortest such substring starting at the leftmost possible e, then resumes searching after that match. In your example, extend and end are two such non-overlapping matches, so it finds both.
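To see both behaviours side by side, here is a minimal sketch using java.util.regex on the string from the question. The class name GreedyVsReluctant and the helper findAll are made up for this illustration; only Pattern and Matcher come from the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class GreedyVsReluctant {

    // Collects every non-overlapping match of regex in input, left to right.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "extend cup end table";
        // Greedy: .+ takes as much as it can, then gives characters back.
        System.out.println(findAll("e.+d", input));   // [extend cup end]
        // Reluctant: .+? takes as little as it can, then grows as needed.
        System.out.println(findAll("e.+?d", input));  // [extend, end]
    }
}
```

Note that the trailing e in "table" never starts a match in either case, because nothing follows it for .+ or .+? to consume.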

Upvotes: 67

Bart Kiers

Reputation: 170148

The regex e.+?d matches an 'e' and then tries to match as few characters as possible (ungreedy or reluctant), followed by a 'd'. That is why the following 2 substrings are matched:

extend cup end table
^^^^^^     ^^^
  1         2

The regex e.+d matches an 'e' and then tries to match as many characters as possible (greedy), followed by a 'd'. What happens is that the first 'e' is found, and then the .+ matches as much as it can (until the end of the line, or input):

extend cup end table
^^^^^^^^^^^^^^^^^^^^

The regex engine reaches the end of the line (or input) and can't match the 'd' in the regex pattern. So it backtracks, giving up one character at a time, until it reaches the last 'd' it saw. That is why a single match is found:

extend cup end table
^^^^^^^^^^^^^^<----- backtrack
  1      
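The two search directions described above can be sketched by hand. This is a simplified illustration, not how java.util.regex is actually implemented: it hard-codes the patterns e.+d and e.+?d, only tries the first 'e' in the input, and assumes the input contains no line terminators. The class and method names are hypothetical.

```java
// Hypothetical hand-rolled simulation, for illustration only.
class BacktrackSketch {

    // Greedy e.+d: let .+ grab everything, then back off until a 'd' follows.
    static String greedyMatch(String input) {
        int start = input.indexOf('e');
        if (start < 0) return null;
        // dotPlusEnd is the exclusive end of what .+ consumed;
        // .+ must keep at least one character, so dotPlusEnd >= start + 2.
        for (int dotPlusEnd = input.length(); dotPlusEnd >= start + 2; dotPlusEnd--) {
            if (dotPlusEnd < input.length() && input.charAt(dotPlusEnd) == 'd') {
                return input.substring(start, dotPlusEnd + 1);
            }
        }
        return null;
    }

    // Reluctant e.+?d: let .+? take one character, then grow until a 'd' follows.
    static String reluctantMatch(String input) {
        int start = input.indexOf('e');
        if (start < 0) return null;
        for (int dotPlusEnd = start + 2; dotPlusEnd < input.length(); dotPlusEnd++) {
            if (input.charAt(dotPlusEnd) == 'd') {
                return input.substring(start, dotPlusEnd + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String input = "extend cup end table";
        System.out.println(greedyMatch(input));    // extend cup end
        System.out.println(reluctantMatch(input)); // extend
    }
}
```

The greedy loop starts at the end of the input and walks backwards, which mirrors the backtrack arrow in the diagram; the reluctant loop walks forwards and stops at the first 'd' it can use.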

Upvotes: 18
