Reputation: 158
My program retrieves the contents of an HTML page and then looks for links to JPG images in the page.
I want to use regular expressions to catch the image links, but I ran into a problem.
In order to do that I used the pattern
"http.*?jpg"
but that also matches strings like: "http://someURL...http://imageURL.jpg"
So I guess what I want is the shortest match, i.e. find "jpg" and look backwards to the nearest "http".
Is it possible using regex?
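A minimal reproduction of the problem, assuming Python's `re` module (the question does not say which language is used) and a made-up page fragment. The lazy `.*?` only keeps the end of the match as short as possible; the match still starts at the leftmost "http":

```python
import re

# Hypothetical page fragment, not taken from a real page.
page = 'href="http://example.com/about" src="http://example.com/cat.jpg"'

# The match begins at the FIRST "http" and runs to the first "jpg" after it.
match = re.search(r"http.*?jpg", page)
print(match.group(0))
```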
Upvotes: 2
Views: 2938
Reputation: 786261
How about using a negative-lookahead-based regex to make sure the shortest text is matched between http://
and .jpg, like this:
/http:\/\/(?!.*?http:\/\/).+?\.jpe?g/
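Note that the lookahead in the pattern above rejects any http:// that is followed by another http:// anywhere later on the same line, so an image URL that is not the last URL on its line is skipped. A possible variant, sketched here in Python (an assumption, since the thread names no language), moves the lookahead inside the repetition so that it is checked before every character. The sample string and variable names are made up for illustration:

```python
import re

# Hypothetical sample text with a non-image URL before AND after the image URL.
sample = ("a http://example.com/page b http://example.com/pic.jpg "
          "c http://example.com/other")

# Pattern from the answer above: lookahead is checked once, right after "http://".
once = re.compile(r"http://(?!.*?http://).+?\.jpe?g")

# Variant: each consumed character must not start another "http://".
tempered = re.compile(r"http://(?:(?!http://).)+?\.jpe?g")

once_matches = once.findall(sample)
tempered_matches = tempered.findall(sample)
print(once_matches)
print(tempered_matches)
```

With this sample the first pattern finds nothing, because the image URL is followed by another URL, while the variant returns only the image URL.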
Upvotes: 0
Reputation: 4744
Try http:[^:]*?jpg
This is a hacky way to make sure the match contains only one colon, and therefore only one http: block (it will also skip URLs that contain a port number). You can further exclude common URL delimiters:
http:[^:\"\}\{\s]*?\.jpg
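As a rough sketch of how this character-class pattern behaves, here it is applied with Python's `re` module (an assumed language; the HTML snippet and names are invented for the example). Because quotes and whitespace are excluded, a match cannot run from one attribute into the next:

```python
import re

# Same idea as the pattern above: no colon, quote, brace, or whitespace
# is allowed between "http:" and ".jpg".
JPG_URL = re.compile(r'http:[^:"{}\s]*?\.jpg')

# Hypothetical HTML fragment.
html = ('<a href="http://example.com/page">'
        '<img src="http://example.com/img/photo.jpg"></a>')

urls = JPG_URL.findall(html)
print(urls)
```

The first URL is rejected because the closing quote appears before any ".jpg", so only the image link is returned.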
Upvotes: 3