Reputation: 1473
Can someone please explain the difference between .+ and .+? in a regular expression?
I have the string: "extend cup end table"
e.+d finds: extend cup end
e.+?d finds: extend and end
I know that + means one or more and ? means zero or one, but I don't understand how they work together.
Upvotes: 49
Views: 30770
Reputation: 500277
Both will match any sequence of one or more characters. The difference is that:
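.+ is greedy and consumes as many characters as it can.
.+? is reluctant and consumes as few characters as it can. Here the ? is not the zero-or-one quantifier: placed directly after +, it only switches the + from greedy to reluctant.
See Differences Among Greedy, Reluctant, and Possessive Quantifiers in the Java tutorial.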
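To see the two quantifiers on their own, here is a minimal sketch. It uses Python's re module purely for brevity (an assumption on my part, since the question does not name a language); greedy and reluctant quantifiers behave the same way for these patterns in Java's java.util.regex.

```python
import re

text = "extend cup end table"

# Greedy: .+ takes every character it can, here the whole string.
greedy = re.match(r".+", text).group()

# Reluctant: .+? takes the minimum it is allowed to, which is one character.
reluctant = re.match(r".+?", text).group()

print(repr(greedy))     # 'extend cup end table'
print(repr(reluctant))  # 'e'
```

With nothing after the quantifier, the reluctant version stops as soon as it has its one required character. It only grows when something later in the pattern forces it to.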
Thus:
e.+d finds the longest substring that starts with the first e and ends with d (and contains at least one character in between). In your example, extend cup end will be found.
e.+?d finds the shortest such substring, scanning from left to right. In your example, extend and end are two such non-overlapping matches, so it finds both.
Upvotes: 67
Reputation: 170148
The regex e.+?d matches an 'e' and then tries to match as few characters as possible (ungreedy or reluctant), followed by a 'd'. That is why the following 2 substrings are matched:
extend cup end table
^^^^^^     ^^^
1          2
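The two marked substrings can be checked with a short script. This is only a sketch using Python's re module (my choice of language, not something stated in the question); the printed positions are 0-based start and exclusive end offsets.

```python
import re

text = "extend cup end table"

# finditer yields non-overlapping matches from left to right.
matches = [(m.group(), m.start(), m.end())
           for m in re.finditer(r"e.+?d", text)]

print(matches)  # [('extend', 0, 6), ('end', 11, 14)]
```

The final 'e' of "table" produces no match because no character follows it for .+? to consume.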
The regex e.+d matches an 'e' and then tries to match as many characters as possible (greedy), followed by a 'd'. What happens is that the first 'e' is found, and then the .+ matches as much as it can (until the end of the line, or input):
extend cup end table
^^^^^^^^^^^^^^^^^^^^
The regex engine reaches the end of the line (or input) and can't match the 'd' in the pattern. So it backtracks, giving up one character at a time, until it reaches the last 'd' it saw. That is why a single match is found:
extend cup end table
^^^^^^^^^^^^^^<----- backtrack
1
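The backtracking can be imitated by hand. The function below is a hypothetical helper written for illustration only (match_e_dot_plus_d is my own name, and this is not how a real regex engine is implemented); it assumes the input has no newlines, since . does not match a newline by default.

```python
import re

text = "extend cup end table"

def match_e_dot_plus_d(s):
    """Rough imitation of e.+d: take everything, then back off to a 'd'."""
    for start, ch in enumerate(s):
        if ch != "e":
            continue
        # .+ first swallows everything after the 'e'. k is the index where
        # the closing 'd' would have to be; each step left is one backtrack.
        # Stopping at start + 2 leaves at least one character for .+
        for k in range(len(s) - 1, start + 1, -1):
            if s[k] == "d":
                return s[start:k + 1]
    return None

by_hand = match_e_dot_plus_d(text)
by_engine = re.search(r"e.+d", text).group()

print(by_hand)    # extend cup end
print(by_engine)  # extend cup end
```

Walking k from the right is what makes the result the last 'd' rather than the first one.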
Upvotes: 18