jorgen

Reputation: 1257

Use preg_match to find if string contains script-tags

How do I write a pattern for PHP's preg_match function to check whether a string contains script tags?

Upvotes: 3

Views: 3222

Answers (3)

Kragen Javier Sitaker

Reputation: 1177

For security reasons? Basically, you can't. Here are some of the ways script can get into a page without a script tag, which I learned about doing this in the past:

  • <a href="javascript:something">...</a>
  • <p onmouseover="something">
  • There are a number of URL schemes that are equivalent to javascript: in different browsers, like jscript:, mocha:, and livescript:. Most are undocumented.
  • Old versions of Netscape treated certain bytes (0x94 and 0x95, I think?) as equivalent to <>. Hopefully there's nothing like this in modern browsers.
  • VBScript.

MySpace tried to do this, and the result was the "Samy is my hero" worm which took down the service for a day or so, among numerous other security disasters on their part.

So if you want to accept a limited subset of HTML that only includes text and formatting, you have to whitelist, not blacklist. You have to whitelist tags, attributes, and if you want to allow links, URL schemes. There are a few existing libraries out there for doing this, but I don't know which ones to recommend in PHP.
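A minimal sketch of the whitelisting approach described above, assuming PHP's DOM extension is available. The names sanitizeHtml, renderNode, hasAllowedScheme, ALLOWED_TAGS and ALLOWED_SCHEMES are hypothetical, and the lists are only illustrative. It rebuilds the output from parsed nodes, so anything not explicitly allowed never reaches the result. Treat it as an illustration of the idea, not a replacement for a maintained sanitizing library.

```php
<?php
// Hypothetical whitelists: tag name => attributes permitted on that tag.
const ALLOWED_TAGS = ['p' => [], 'b' => [], 'i' => [], 'em' => [], 'strong' => [], 'a' => ['href']];
const ALLOWED_SCHEMES = ['http', 'https', 'mailto'];

// True only for absolute URLs whose scheme is whitelisted.
// Assumption of this sketch: relative URLs are rejected as well.
function hasAllowedScheme(string $url): bool
{
    // Remove whitespace and control characters that could hide a scheme.
    $clean = (string) preg_replace('/[\x00-\x20]+/', '', $url);
    if (!preg_match('/^([a-z][a-z0-9+.\-]*):/i', $clean, $m)) {
        return false;
    }
    return in_array(strtolower($m[1]), ALLOWED_SCHEMES, true);
}

// Re-serialize a node, keeping only whitelisted tags and attributes.
function renderNode(DOMNode $node): string
{
    if ($node instanceof DOMText) {
        return htmlspecialchars($node->textContent, ENT_QUOTES, 'UTF-8');
    }
    if (!($node instanceof DOMElement)) {
        return ''; // comments, processing instructions, etc. are dropped
    }
    $tag = strtolower($node->nodeName);
    if (!array_key_exists($tag, ALLOWED_TAGS)) {
        return ''; // unknown element: dropped together with its content
    }
    $attrs = '';
    foreach (ALLOWED_TAGS[$tag] as $name) {
        if (!$node->hasAttribute($name)) {
            continue;
        }
        $value = $node->getAttribute($name);
        if ($name === 'href' && !hasAllowedScheme($value)) {
            continue; // e.g. javascript:, vbscript:, or no scheme at all
        }
        $attrs .= ' ' . $name . '="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '"';
    }
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= renderNode($child);
    }
    return "<$tag$attrs>$inner</$tag>";
}

function sanitizeHtml(string $html): string
{
    $previous = libxml_use_internal_errors(true); // silence parse warnings
    $doc = new DOMDocument();
    // The wrapper declares UTF-8 and keeps the fragment inside <body>.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>'
        . $html . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $out = '';
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= renderNode($child);
        }
    }
    return $out;
}

// Usage: the event handler, the script element and the javascript: link
// target are not on any whitelist, so none of them survive.
echo sanitizeHtml('<p onmouseover="x()">Hi</p><script>alert(1)</script><a href="javascript:alert(2)">a</a>'), PHP_EOL;
```

Dropping a disallowed element together with its content is a deliberate simplification here; a real sanitizer might keep the text of harmless unknown tags instead.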

Upvotes: 4

soulmerge

Reputation: 75704

Don't use regular expressions for processing XML/HTML. Use PHP's DOM classes instead; they should be much more reliable than any regex you will find:

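$document = new DOMDocument();
// Suppress libxml warnings triggered by malformed real-world HTML.
libxml_use_internal_errors(true);
$document->loadHTML($html);
$xpath = new DOMXPath($document);
if ($xpath->query('//script')->length > 0) {
    // document contains script tags
}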
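
As a rough illustration, the same check can be wrapped in a function and tried on a few inputs. The name containsScriptTag is hypothetical. The sketch shows that the parser normalizes tag case, which a naive regex has to handle by hand, and also that a script tag check alone says nothing about event-handler attributes.

```php
<?php
// Hypothetical helper wrapping the DOM/XPath check.
function containsScriptTag(string $html): bool
{
    if (trim($html) === '') {
        return false; // nothing to parse; loadHTML() rejects empty input
    }
    $previous = libxml_use_internal_errors(true); // silence parse warnings
    $document = new DOMDocument();
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    return $xpath->query('//script')->length > 0;
}

var_dump(containsScriptTag('<p>hello</p>'));                   // false
var_dump(containsScriptTag('<ScRiPt src="x.js"></sCrIpT>'));   // true
var_dump(containsScriptTag('<p onmouseover="evil()">hi</p>')); // false, yet still runs script
```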

Upvotes: 1

Cem Kalyoncu

Reputation: 14593

Are you trying to escape them? If so, try the following (not tested):

$string=str_replace(array("&", "<", ">"), array("&amp;", "&lt;", "&gt;"), $string);

This way, a surprise will be waiting for your attackers.
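
For comparison, PHP's built-in htmlspecialchars performs the same three replacements and, with the ENT_QUOTES flag, escapes quote characters as well, which matters when the value ends up inside an HTML attribute. A short sketch with a made-up input string:

```php
<?php
// Illustrative input only.
$string = '<script>alert("hi")</script> & more';

// ENT_QUOTES escapes both double and single quotes; the charset is given explicitly.
$escaped = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// prints: &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt; &amp; more
```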

Upvotes: 0
