Reputation: 111
For a small one-page CMS I want to strip <script> and other tags that people could use with bad intentions. I've tried strip_tags and preg_replace, but neither is working for some reason.
The one-page CMS has 6 fields to edit. Those are saved in plain text files. When I edit one of them, I need it to remove all tags like <script>, <embed>, <object>, <iframe> and others.
I've looked at HTML Purifier, but I don't understand how it is supposed to work, as I'm not very familiar with PHP. It also looks a bit too big for my needs, I guess.
This is the code (here I try to remove the <script> tag from the text area named newscontent):
<?php
if (isset($_POST['edit'])) {
$newscontent = preg_replace('/<script.+?<\/script>/im', '', $newscontent);
if (file_put_contents('title.txt', utf8_encode($_POST['title'])) !== FALSE &&
file_put_contents('subtitle.txt', utf8_encode($_POST['subtitle'])) !== FALSE &&
file_put_contents('datum.txt', utf8_encode($_POST['datum'])) !== FALSE &&
file_put_contents('time.txt', utf8_encode($_POST['time'])) !== FALSE &&
file_put_contents('timemin.txt', utf8_encode($_POST['timemin'])) !== FALSE &&
file_put_contents('newscontent.txt', utf8_encode($_POST['newscontent'])) !== FALSE
)
echo '<p class="succes">Your changes are saved</p>', "\n";
}
$title = utf8_decode(file_get_contents('title.txt'));
$subtitle = utf8_decode(file_get_contents('subtitle.txt'));
$datum = utf8_decode(file_get_contents('datum.txt'));
$time = utf8_decode(file_get_contents('time.txt'));
$timemin = utf8_decode(file_get_contents('timemin.txt'));
$newscontent = utf8_decode(file_get_contents('newscontent.txt'));
?>
Upvotes: 0
Views: 394
Reputation: 33148
Your code doesn't work because you are performing the replacement on the variable $newscontent, but writing $_POST['newscontent'] to the file. I guess you have register_globals switched on (which is bad); otherwise this would generate an error.
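A minimal sketch of that fix: clean the submitted value first, then write the cleaned variable rather than the raw $_POST entry. The helper name strip_script_blocks and the $post array (a stand-in for $_POST) are made up for illustration, and the regex only handles plain <script> blocks, so this is not a complete sanitizer.

```php
<?php
// Hypothetical helper: removes <script>...</script> blocks from a string.
function strip_script_blocks(string $html): string
{
    // i = case-insensitive, s = let . match newlines (multi-line scripts).
    // preg_replace returns null on error; fall back to an empty string.
    return preg_replace('/<script\b.*?<\/script>/is', '', $html) ?? '';
}

// Stand-in for the submitted form data ($_POST in the real page).
$post = ['newscontent' => "Hello<script>alert(1)</script> world"];

// Clean the submitted value and keep the result in a variable...
$newscontent = strip_script_blocks($post['newscontent']);

// ...then write that variable, not $_POST['newscontent'], to the file:
// file_put_contents('newscontent.txt', $newscontent);

echo $newscontent, "\n"; // prints "Hello world"
```

The file write is commented out so the snippet can be run without touching the disk.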
I would recommend you persevere with HTMLPurifier. There are many, many bad things people could add to text if they have 'bad intentions', and your approach does not even scratch the surface. For example, even if you fixed your code, it would not prevent people from adding something like this:
<img src="http://www.google.com/logo.gif" onload="javascript:bad stuff here" />
not to mention the complications of different character sets.
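To illustrate the point, the following sketch runs the question's pattern (with the s flag added so that . also matches newlines, which is an assumption on top of the original) against the image tag above. Since there is no <script> element to match, the input comes back unchanged, event handler included.

```php
<?php
// Pattern from the question, with s added so . also matches newlines.
$pattern = '/<script.+?<\/script>/is';

$input = '<img src="http://www.google.com/logo.gif" onload="javascript:bad stuff here" />';
$output = preg_replace($pattern, '', $input);

// Nothing matched, so the string is returned untouched.
var_dump($output === $input);                    // bool(true)

// The onload attribute is still present in the "cleaned" output.
var_dump(strpos($output, 'onload=') !== false);  // bool(true)
```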
Upvotes: 3
Reputation: 3822
< is a special character in regex, you need to escape it.
$newscontent = preg_replace('/\<(script|object|embed).+?\<\/\1\>/im', '', $newscontent);
Upvotes: -1