Delgan

Reputation: 19627

Smallest possible match / nongreedy regex search

I first thought that this answer would totally solve my issue, but it did not.

I have a string url like this one:

http://www.someurl.com/some-text-1-0-1-0-some-other-text.htm#id_76

I would like to extract some-other-text, so I came up with the following regex:

/0-(.*)\.htm/

Unfortunately, this matches 1-0-some-other-text because regexes are greedy. I cannot manage to make it nongreedy using .*?; it does not change anything, as you can see here.

I also tried with the U modifier but it did not help.

Why does the "nongreedy" tip not work?

Upvotes: 4

Views: 238

Answers (2)

Wiktor Stribiżew

Reputation: 626845

In case you need to get the closest match, you can make use of a tempered greedy token.

0-((?:(?!0-).)*)\.htm

See demo

The lazy version of your regex does not work because the regex engine scans the string from left to right. It always takes the leftmost position at which a match can start and tries to match from there. So, in your case, it found the first 0- and was happy with it. Laziness only affects where the match ends (the rightmost position), not where it starts. Here there is only one possible end, the single .htm, so lazy matching cannot give the expected result.
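To make this concrete, here is a minimal sketch in Python's re module (an assumption on my part: the question looks like PCRE, but greedy and lazy quantifiers behave the same way here). It shows that both versions start at the same leftmost 0- and therefore capture the same text.

```python
import re

url = "http://www.someurl.com/some-text-1-0-1-0-some-other-text.htm#id_76"

# Greedy and lazy versions of the pattern from the question.
greedy = re.search(r"0-(.*)\.htm", url)
lazy = re.search(r"0-(.*?)\.htm", url)

# Both matches begin at the first "0-" in the string; laziness only
# influences where the match stops, and there is just one ".htm".
print(greedy.start() == lazy.start())  # True
print(greedy.group(1))                 # 1-0-some-other-text
print(lazy.group(1))                   # 1-0-some-other-text
```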

You can also use

0-((?!.*?0-).*)\.htm

It will work if you have individual strings to extract the values from.
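As a rough illustration of the difference between the two patterns, here is a sketch in Python's re module (again an assumption; the sample text with two file names on one line is made up for the example). On a single URL both patterns give the same capture, but when several values sit in one string the lookahead version only finds the last one, because (?!.*?0-) rejects any 0- that has another 0- anywhere after it.

```python
import re

tempered = re.compile(r"0-((?:(?!0-).)*)\.htm")
lookahead = re.compile(r"0-((?!.*?0-).*)\.htm")

# Single string: both patterns capture the text after the last "0-".
url = "http://www.someurl.com/some-text-1-0-1-0-some-other-text.htm#id_76"
single_tempered = tempered.search(url).group(1)
single_lookahead = lookahead.search(url).group(1)
print(single_tempered)   # some-other-text
print(single_lookahead)  # some-other-text

# Hypothetical input with two values on the same line.
text = "a-1-0-foo.htm b-1-0-bar.htm"
multi_tempered = tempered.findall(text)
multi_lookahead = lookahead.findall(text)
print(multi_tempered)    # ['foo', 'bar']
print(multi_lookahead)   # ['bar']
```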

Upvotes: 3

Martin Brandl

Reputation: 58931

Do you want to exclude the 1-0? If so, you can use a non-capturing group:

(?:1-0-)+(.*?)\.htm

Demo
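A quick check of this pattern, sketched in Python's re module (an assumption, since the question does not name a language). Note that it relies on the unwanted prefix being literally one or more repetitions of 1-0-.

```python
import re

url = "http://www.someurl.com/some-text-1-0-1-0-some-other-text.htm#id_76"

# (?:1-0-)+ consumes every repeated "1-0-" without capturing it,
# so the lazy group starts right after the last repetition.
match = re.search(r"(?:1-0-)+(.*?)\.htm", url)
print(match.group(1))  # some-other-text
```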

Upvotes: 0
