Mark

Reputation: 3859

Pattern only matches once, when multiple matches are in the string

I am trying to match image URLs that are enclosed in URL tags, as follows:

[URL=http://www.google.com/sdaasd/sadasda/asddsa/sadsa/dasd.html][IMG]http://www.cnn.com/asd.jpg[/IMG][/URL] 

I have the following pattern, which works perfectly when matched against a single URL/IMG combo:

\[URL=("|)([\s\S]*?)("|)]\[img\](https?:\/\/.*\.(?:png|jpg))\[\/img]\[\/URL\]

However, if I repeat the URL/IMG combo as follows:

[URL=http://www.google.com/sdaasd/sadasda/asddsa/sadsa/dasd.html][IMG]http://www.cnn.com/asd.jpg[/IMG][/URL] [URL=http://www.google.com/sdaasd/sadasda/asddsa/sadsa/dasd.html][IMG]http://www.cnn.com/asd.jpg[/IMG][/URL]

then it no longer works. Any ideas for a workaround or fix?

Upvotes: 0

Views: 60

Answers (3)

Casimir et Hippolyte

Reputation: 89565

A quick fix is to use a lazy quantifier instead of a greedy one. In other words, replace .* with .*?

You can also use a more efficient pattern that avoids the lazy quantifier, for example:

$pattern ='~\[URL=([^]]*+)]\[IMG]([^[]*+)\[/IMG]\[/URL]~';
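
As a rough illustration of the negated-character-class idea, here is a minimal Python sketch run against the sample input from the question. The names `PAIR`, `text`, and `pair_pattern` are made up for this example. The possessive `*+` is written as a plain `*` here because Python's `re` module only accepts possessive quantifiers from version 3.11 onward; the matches should be the same, only the backtracking behaviour differs.

```python
import re

# Sample input from the question: the same URL/IMG pair repeated twice.
PAIR = (
    "[URL=http://www.google.com/sdaasd/sadasda/asddsa/sadsa/dasd.html]"
    "[IMG]http://www.cnn.com/asd.jpg[/IMG][/URL]"
)
text = PAIR + " " + PAIR

# [^\]]* cannot run past the closing ] of the URL tag, and [^\[]* cannot
# run past the [ that opens [/IMG], so each match stays inside one pair.
pair_pattern = re.compile(r"\[URL=([^\]]*)\]\[IMG\]([^\[]*)\[/IMG\]\[/URL\]")

pairs = pair_pattern.findall(text)  # list of (link, image) tuples
for link, image in pairs:
    print(link, "->", image)
```

Note that this pattern, as written in the answer, no longer restricts the image to png or jpg; that check would have to be added back if it matters.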

Upvotes: 1

Kevin

Reputation: 56099

Your .* is greedy, so it matches as much as possible, including the ][IMG] of the next tag pair. You can avoid this by using a character class that excludes ] instead: [^]]*
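
To see what the greedy .* actually swallows, the sketch below runs the question's original pattern in Python and prints the image group, then swaps .* for a class that excludes ]. The variable names are hypothetical, and `re.IGNORECASE` is an assumption: the pattern spells [img] in lowercase while the input uses [IMG], so the asker was presumably matching case-insensitively.

```python
import re

PAIR = (
    "[URL=http://www.google.com/sdaasd/sadasda/asddsa/sadsa/dasd.html]"
    "[IMG]http://www.cnn.com/asd.jpg[/IMG][/URL]"
)
text = PAIR + " " + PAIR

# Original pattern from the question (greedy .* in the image group).
greedy = re.compile(
    r'\[URL=("|)([\s\S]*?)("|)]\[img\](https?:\/\/.*\.(?:png|jpg))\[\/img]\[\/URL\]',
    re.IGNORECASE,  # assumption: case-insensitive matching
)
# Same pattern with .* replaced by [^\]]*, which cannot cross a ].
bounded = re.compile(
    r'\[URL=("|)([\s\S]*?)("|)]\[img\](https?:\/\/[^\]]*\.(?:png|jpg))\[\/img]\[\/URL\]',
    re.IGNORECASE,
)

greedy_images = [m.group(4) for m in greedy.finditer(text)]
bounded_images = [m.group(4) for m in bounded.finditer(text)]

print(len(greedy_images), greedy_images[0])  # one match that spans both pairs
print(len(bounded_images), bounded_images)   # one image URL per pair
```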

Upvotes: 1

p.s.w.g

Reputation: 149020

My guess is you need to modify the .* to use a non-greedy quantifier, .*?, like this:

\[URL=("|)([\s\S]*?)("|)]\[img\](https?:\/\/.*?\.(?:png|jpg))\[\/img]\[\/URL\]
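
Here is a small Python sketch of the non-greedy version applied to the repeated input from the question. The names `lazy`, `text`, and `mixed` are invented for illustration, and `re.IGNORECASE` is an assumption made so that the lowercase [img] in the pattern matches the uppercase [IMG] in the input. The second half uses made-up example URLs to show one limitation worth knowing about: a lazy quantifier still keeps expanding until the rest of the pattern can match, so it may cross tag boundaries when an earlier image has a different extension.

```python
import re

PAIR = (
    "[URL=http://www.google.com/sdaasd/sadasda/asddsa/sadsa/dasd.html]"
    "[IMG]http://www.cnn.com/asd.jpg[/IMG][/URL]"
)
text = PAIR + " " + PAIR

lazy = re.compile(
    r'\[URL=("|)([\s\S]*?)("|)]\[img\](https?:\/\/.*?\.(?:png|jpg))\[\/img]\[\/URL\]',
    re.IGNORECASE,  # assumption: case-insensitive matching
)

# Group 2 is the link target, group 4 is the image URL.
found = [(m.group(2), m.group(4)) for m in lazy.finditer(text)]
for link, image in found:
    print(link, "->", image)

# Hypothetical input: a .gif pair followed by a .jpg pair.
mixed = (
    "[URL=http://a.example/x.html][IMG]http://a.example/a.gif[/IMG][/URL] "
    "[URL=http://b.example/y.html][IMG]http://b.example/b.jpg[/IMG][/URL]"
)
spill = [m.group(4) for m in lazy.finditer(mixed)]
print(spill)  # a single image group that runs from a.gif through b.jpg
```

If inputs like the second one are possible, replacing the wildcards with negated character classes is likely the safer choice.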

Upvotes: 1
