devmyb

Reputation: 375

Why did my regex not give the desired result?

I have a string from which I need to extract only the URLs that end in an image extension. I am using the following regex:

ITEMIMAGEURL\d+=(http://.*?)(,|$|\n)

The string I have to extract from is:

ITEMIMAGEURL0 = http://images.example.com/xyz/l/dasda/test-image-6af8af8afa9.jpg,
ITEMIMAGEURL1 = http://images.example.com/xyz/l/dasda/test-image-,
ITEMIMAGEURL2 = http://images.example.com/abc/as/test/test-image-abrd23lg9.jpg

My regex works fine, but I want to extract only the URLs that end in .jpg, .gif, or another image extension, so I tried:

ITEMIMAGEURL\d+=(http://.*?(?(?=.[a-zA-Z]{3,4})))(,|$|\n)

But it didn't work as expected.

My expected result is:

http://images.example.com/xyz/l/dasda/test-image-6af8af8afa9.jpg
http://images.example.com/abc/as/test/test-image-abrd23lg9.jpg

Upvotes: 1

Views: 37

Answers (2)

Oleg Cherednik

Reputation: 18245

ITEMIMAGEURL\d+=(http:\/(?:\/[\w\.-]+)+\.(?:jpe?g|gif|png),?\s?)?

I assume you know the basics of regular expressions, so just one note: (?:\/[\w\.-]+) is a pattern for a valid URL path segment. It is not the only valid one; you could choose any you like, e.g. (?:\/[^\s,]+).
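
To illustrate the trade-off between the two segment patterns, here is a minimal Python sketch. The helper `build_pattern`, the optional whitespace around `=`, and the extra `ITEMIMAGEURL3` input are assumptions made for illustration; they are not part of the regex above.

```python
import re

def build_pattern(segment):
    # Hypothetical helper: 'segment' is the regex for the characters
    # allowed inside one "/..." path segment.
    return re.compile(
        r"ITEMIMAGEURL\d+\s*=\s*(http:/(?:/" + segment + r")+\.(?:jpe?g|gif|png))"
    )

strict = build_pattern(r"[\w.-]+")  # word characters, dots, hyphens only
loose = build_pattern(r"[^\s,]+")   # anything except whitespace and commas

# Made-up input with a percent-encoded space in the path.
encoded = "ITEMIMAGEURL3=http://images.example.com/a%20b/c.png"

print(strict.findall(encoded))  # the '%' is outside [\w.-], so nothing matches
print(loose.findall(encoded))   # the looser class accepts the whole path
```

The stricter class rejects URLs with characters such as `%`, while the looser one accepts anything up to the next comma or whitespace, so the choice depends on what your URLs can contain.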

Upvotes: 1

anubhava

Reputation: 785156

You can use this regex to extract image URLs:

ITEMIMAGEURL\d+=(http://[^,\s]+?\.(?:jpe?g|gif|png))

Your image URL is captured in group #1. This assumes your URL doesn't contain a comma character.
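
As a rough Python sketch of how this could be applied to the sample input from the question: the function name `extract_image_urls` is illustrative, and the `\s*=\s*` part is an assumption added because the sample as posted has spaces around `=`.

```python
import re

# Pattern from above, with optional whitespace allowed around '='
# (an assumption, since the posted sample has spaces there).
IMAGE_URL = re.compile(
    r"ITEMIMAGEURL\d+\s*=\s*(http://[^,\s]+?\.(?:jpe?g|gif|png))"
)

def extract_image_urls(text):
    """Return the group #1 capture of every match, i.e. the image URLs."""
    return IMAGE_URL.findall(text)

sample = """ITEMIMAGEURL0 = http://images.example.com/xyz/l/dasda/test-image-6af8af8afa9.jpg,
ITEMIMAGEURL1 = http://images.example.com/xyz/l/dasda/test-image-,
ITEMIMAGEURL2 = http://images.example.com/abc/as/test/test-image-abrd23lg9.jpg"""

for url in extract_image_urls(sample):
    print(url)
```

The `ITEMIMAGEURL1` entry is skipped because `[^,\s]+?` cannot cross the trailing comma and no image extension appears before it.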

If a comma is allowed in image URLs, then use this regex with a negative lookahead:

ITEMIMAGEURL\d+=(http://(?:(?!,ITEMIMAGEURL\d).)+\.(?:jpe?g|gif|png))
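
A small Python sketch of the lookahead variant, assuming the entries arrive on a single line separated by commas. The input string is made up for illustration, with a comma inside the first URL.

```python
import re

# Each '.' is consumed only if the text at that position does not start
# the next ",ITEMIMAGEURL<digit>" entry, so a match cannot run past its
# own entry even though commas are allowed inside the URL.
PATTERN = re.compile(
    r"ITEMIMAGEURL\d+=(http://(?:(?!,ITEMIMAGEURL\d).)+\.(?:jpe?g|gif|png))"
)

# Hypothetical single-line input; the first URL contains a comma.
data = (
    "ITEMIMAGEURL0=http://images.example.com/a,b/test.jpg,"
    "ITEMIMAGEURL1=http://images.example.com/a/test-,"
    "ITEMIMAGEURL2=http://images.example.com/c/test.png"
)

urls = PATTERN.findall(data)
for url in urls:
    print(url)
```

The entry without an image extension produces no match, and the comma inside the first URL is kept because it is not followed by `ITEMIMAGEURL` and a digit.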

Upvotes: 3
