Reputation: 375
I have a string from which I need to extract only the URLs that end in an image extension. I am currently using the following regex:
ITEMIMAGEURL\d+=(http://.*?)(,|$|\n)
and the string I have to extract from is:
ITEMIMAGEURL0 = http://images.example.com/xyz/l/dasda/test-image-6af8af8afa9.jpg,
ITEMIMAGEURL1 = http://images.example.com/xyz/l/dasda/test-image-,
ITEMIMAGEURL2 = http://images.example.com/abc/as/test/test-image-abrd23lg9.jpg
My regex works fine, but I want to extract only the URLs ending in .jpg, .gif, or any other image extension, so I tried:
ITEMIMAGEURL\d+=(http://.*?(?(?=.[a-zA-Z]{3,4})))(,|$|\n)
But it didn't work as expected.
My expected result is:
http://images.example.com/xyz/l/dasda/test-image-6af8af8afa9.jpg
http://images.example.com/abc/as/test/test-image-abrd23lg9.jpg
Upvotes: 1
Views: 37
Reputation: 18245
ITEMIMAGEURL\d+=(http:\/(?:\/[\w\.-]+)+\.(?:jpe?g|gif|png))
I assume you know the basics of regular expressions, so just one note: (?:\/[\w\.-]+)
is a pattern for one valid URL path segment. It is not the only valid choice; you could use any pattern you like, e.g. (?:\/[^\s,]+).
Upvotes: 1
Reputation: 785156
You can use this regex to extract image URLs:
ITEMIMAGEURL\d+=(http://[^,\s]+?\.(?:jpe?g|gif|png))
The image URL is captured in group #1. This assumes the URL doesn't contain a comma.
If commas are allowed in image URLs, use this regex with a negative lookahead instead:
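As a rough illustration, here is how that first regex could be applied in Python to the sample from the question. The function name `extract_image_urls` is made up for this sketch. The `\s*` around `=` is also an addition of mine, since the sample listing shows spaces there; drop it if your real data has none.

```python
import re

# Sample input copied from the question (with spaces around "=" as listed there).
text = """ITEMIMAGEURL0 = http://images.example.com/xyz/l/dasda/test-image-6af8af8afa9.jpg,
ITEMIMAGEURL1 = http://images.example.com/xyz/l/dasda/test-image-,
ITEMIMAGEURL2 = http://images.example.com/abc/as/test/test-image-abrd23lg9.jpg"""

# The regex from this answer; \s* around "=" is an assumption added for this sketch.
IMAGE_URL = re.compile(
    r"ITEMIMAGEURL\d+\s*=\s*(http://[^,\s]+?\.(?:jpe?g|gif|png))"
)

def extract_image_urls(s):
    """Return every group-1 capture, i.e. only URLs ending in an image extension."""
    return IMAGE_URL.findall(s)

for url in extract_image_urls(text):
    print(url)
```

The entry for ITEMIMAGEURL1 is skipped because no image extension appears before its comma, so only the two URLs from the expected result are printed.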
ITEMIMAGEURL\d+=(http://(?:(?!,ITEMIMAGEURL\d).)+\.(?:jpe?g|gif|png))
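A small Python sketch of the lookahead variant, on a made-up single-line input where one URL contains a comma. The example.com URLs and the function name `extract_image_urls_with_commas` are hypothetical, chosen only to show the behaviour.

```python
import re

# Hypothetical input: all entries on one line, separated by ",ITEMIMAGEURL<n>=".
# The first URL contains a comma; the second has no image extension.
line = (
    "ITEMIMAGEURL0=http://example.com/a,b/pic.jpg,"
    "ITEMIMAGEURL1=http://example.com/no-extension-,"
    "ITEMIMAGEURL2=http://example.com/c.png"
)

# The lookahead regex from this answer, unchanged. Each "." is consumed only
# if the text at that position does not start with ",ITEMIMAGEURL<digit>",
# so a match cannot run past the start of the next entry.
IMAGE_URL_COMMAS = re.compile(
    r"ITEMIMAGEURL\d+=(http://(?:(?!,ITEMIMAGEURL\d).)+\.(?:jpe?g|gif|png))"
)

def extract_image_urls_with_commas(s):
    """Return group-1 captures; commas inside a URL are kept."""
    return IMAGE_URL_COMMAS.findall(s)

for url in extract_image_urls_with_commas(line):
    print(url)
```

Here the comma in `a,b` stays inside the first URL, and the entry without an image extension is dropped.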
Upvotes: 3