baby boom

Reputation: 158

Regular Expressions find shortest match?

My program retrieves the contents of an HTML page and then looks for links to jpg images in the page.

I want to use regular expressions to catch the images, but I've hit a problem.

In order to do that I used the pattern

"http.*?jpg"

but that ends up matching strings like: "http://someURL...http://imageURL.jpg"

So I guess what I want is the shortest match, i.e. find "jpg" and look backwards to the nearest "http".

Is it possible using regex?
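
A minimal Python reproduction of the behaviour described above. The sample string and its URLs are made up for illustration:

```python
import re

# Hypothetical sample text: a non-image link followed by an image link.
html = "see http://example.com/page and http://example.com/pic.jpg here"

# The lazy quantifier only shortens the match on the right-hand side;
# the match still begins at the leftmost "http" that can lead to "jpg".
matches = re.findall(r"http.*?jpg", html)
print(matches)
```

The single match spans both URLs, because the engine tries start positions from left to right and `.*?` only controls where the match ends, not where it begins.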

Upvotes: 2

Views: 2938

Answers (2)

anubhava

Reputation: 786261

How about using a regex with a negative lookahead to make sure the shortest text between http:// and .jpg is matched, like this:

/http:\/\/(?!.*?http:\/\/).+?\.jpe?g/
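
A rough Python sketch of how this pattern behaves. The sample strings are hypothetical, and the second pattern (`tempered`) is an assumed variant that is not part of this answer: it re-checks the lookahead before every character instead of once after the first http://.

```python
import re

# Pattern from the answer, without the JavaScript-style delimiters and escaped slashes.
last_only = re.compile(r"http://(?!.*?http://).+?\.jpe?g")

# Assumed variant (not from the answer): no "http://" may start anywhere inside the match.
tempered = re.compile(r"http://(?:(?!http://).)+?\.jpe?g")

# Hypothetical sample lines.
one_image = "see http://example.com/page and http://example.com/pic.jpg here"
two_images = "http://a.example/x.jpg http://b.example/y.jpeg"

print(last_only.findall(one_image))
print(last_only.findall(two_images))
print(tempered.findall(two_images))
```

On `two_images` the answer's pattern returns only the last URL, because the lookahead rejects any http:// that is followed by another http:// on the same line. The variant returns both.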

Upvotes: 0

Hans Z

Reputation: 4744

Try http:[^:]*?jpg, which is a hacky way to make sure the match contains only one colon and therefore only one http: block. You can further exclude common URL delimiters:

http:[^:\"\}\{\s]*?\.jpg
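
A small Python sketch of this pattern in use. The HTML snippet and URLs are invented for illustration, and the character class is written without the backslashes, which should be equivalent inside `[...]`:

```python
import re

# After "http:", the match may not contain a colon, double quote, brace or whitespace.
pattern = re.compile(r'http:[^:"{}\s]*?\.jpg')

# Hypothetical page fragment: one ordinary link and one image link.
html = '<a href="http://example.com/page">x</a><img src="http://example.com/pic.jpg">'
found = pattern.findall(html)
print(found)

# Hypothetical URL with a port: the second colon stops the match.
with_port = pattern.findall("http://example.com:8080/pic.jpg")
print(with_port)
```

One trade-off of excluding the colon is that URLs containing a port number are skipped, and https:// links do not match at all because the pattern requires the literal text http: at the start.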

Upvotes: 3
