KF4

Reputation: 11

Matching sets of tags in PHP with Regular Expression

I am currently working on protecting my AJAX Chat against exploits by checking all text in PHP before it is passed to the client. So far I have been successful, except for one part where I need to match sets of image tags.

I want it to detect any newline character between a set of tags. I have sort of managed this, but my solution is greedy and also matches newline characters outside of tags when there are multiple sets of tags.

At the moment I have the following, which works if I only want to match [img]{newline}[/img]:

if (preg_match('/\[\bimg\].*\x0A.*\[\/\bimg\]/', $text)) { /* code */ }

But if I have [img]image.jpg[/img]{newline}[img]image.jpg[/img], it only sees the very first opening tag and the very last closing tag, which I do not want.

So now I ask, how do you make it match each set of tags properly?

Edit: For clarification. Any newline characters inside tags are bad, so I want to detect them. Any newline characters outside tags are good and I want to ignore them. The reason being, if the client processes a newline character inside of a tag, it crashes.

Upvotes: 1

Views: 94

Answers (2)

Thiefbrain

Reputation: 26

Try setting the s modifier, like this:

if (preg_match('/\[\bimg\].*\x0A.*\[\/\bimg\]/s', $text)) { /* code */ }

See also the PHP Documentation for Regex modifiers
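
As a rough illustration of what the s modifier changes, here is a small sketch that runs the pattern from the question with and without it. The sample strings and variable names are made up for the demo. Assuming the inputs below, the s modifier lets the pattern span more than one newline, but the greedy .* still stretches from the first [img] to the last [/img], so the case with a newline between two separate tag pairs is still reported as a match.

```php
<?php
// Sample inputs (invented for illustration, based on the cases in the question).
$twoNewlines  = "[img]\nlook, two newlines!\n[/img]";
$separateTags = "[img]image.jpg[/img]\n[img]image.jpg[/img]";

// The pattern from the question, without and with the s (dot-all) modifier.
$withoutS = '/\[\bimg\].*\x0A.*\[\/\bimg\]/';
$withS    = '/\[\bimg\].*\x0A.*\[\/\bimg\]/s';

// Without s, "." cannot cross the second newline, so this does not match.
$a = preg_match($withoutS, $twoNewlines);

// With s, "." also matches newlines, so the whole tag pair matches.
$b = preg_match($withS, $twoNewlines);

// The newline here is outside the tags, but the greedy pattern still matches
// from the first [img] to the last [/img].
$c = preg_match($withS, $separateTags);

var_dump($a, $b, $c);
```

So the modifier on its own addresses the multi-newline case, not the greedy matching across separate tag pairs.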

Upvotes: 0

Niet the Dark Absol

Reputation: 324760

Just make it ungreedy by putting ? after each of the two .* (so they become .*?).

But note that your current solution will not match this:

[img]
look, two newlines!
[/img]

I'm not sure why you want to do this, but you can make . match newlines by adding the s modifier to your regex. Then it's just "(\[img\](.*?)\[/img\])is" to match each set of tags (the outer parentheses are the pattern delimiters here), and you can capture the inner group and check each one individually for newlines if you want.
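
A minimal sketch of that capture-and-check idea, assuming the tags are always written as [img]...[/img] and never nested. The function name hasNewlineInsideImgTag is hypothetical and not part of the original code; it uses / as the delimiter instead of parentheses, which should behave the same way.

```php
<?php
// Hypothetical helper (the name is illustrative only): returns true if any
// [img]...[/img] pair in $text contains a newline or carriage return.
function hasNewlineInsideImgTag(string $text): bool
{
    // Lazy (.*?) stops at the first closing tag after each opening tag.
    // "s" lets . match newlines, "i" makes the tag names case-insensitive.
    if (!preg_match_all('/\[img\](.*?)\[\/img\]/is', $text, $matches)) {
        return false; // no complete tag pairs found (or a regex error)
    }

    foreach ($matches[1] as $inner) {
        // Inspect only the captured tag contents, not the text between tags.
        if (strpbrk($inner, "\r\n") !== false) {
            return true;
        }
    }

    return false;
}
```

With this approach, "[img]image.jpg[/img]\n[img]image.jpg[/img]" should pass, because the newline sits between the two pairs, while "[img]\nlook, two newlines!\n[/img]" should be flagged. Checking the captured contents separately also avoids relying on lazy quantifiers alone, since a single pattern with .*? on both sides of the newline can still run from one tag pair into the next.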

Upvotes: 2
