Reputation: 12935
I have a <textarea> where I am allowing user content to be submitted. I would like to allow a couple of tags such as <b>, <i>, <blockquote>, and <del>. However, since the content will be displayed in the page, I have to ensure that there are no unclosed tags.
I know I can use strip_tags($textarea, '<b><i><blockquote><del>'), but how can I then ensure that all the remaining tags are properly closed?
Upvotes: 5
Views: 949
Reputation: 52802
You really want to use a proper HTML filtering library such as HTMLPurifier, especially since you're planning to display the submitted content in your page. HTMLPurifier inspects attributes, CSS, and other inline styles to prevent XSS, and it will also attempt to make sense of your HTML and close any missing tags (much as Tidy does, as already suggested), so that the resulting fragment is XHTML compliant.
I don't think that Tidy will attempt to remove any evil XSS segments.
Upvotes: 0
Reputation: 31730
The DOMDocument extension provides an API for manipulating HTML DOM structures and ought to be worth looking at for this.
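As a rough sketch of that idea (not a tested production routine), the fragment can be loaded into DOMDocument, which lets the parser close any dangling tags, and then serialized back out. The function name close_open_tags() and the wrapper <div> are made up for this illustration, and the sketch assumes the input has already been through strip_tags() so that it cannot contain a stray </div>.

```php
<?php
// Hypothetical helper: returns $fragment with any unclosed tags closed.
function close_open_tags(string $fragment): string
{
    $doc = new DOMDocument();

    // Malformed input makes libxml emit warnings; collect them silently.
    $previous = libxml_use_internal_errors(true);

    // The XML prolog hints that the input is UTF-8; the wrapper <div>
    // gives us a known node from which to read the fragment back.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $fragment . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serialize only the wrapper's children, so no html/body/div
    // tags end up in the result.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $output = '';
    foreach ($wrapper->childNodes as $child) {
        $output .= $doc->saveHTML($child);
    }

    return $output;
}

// Usage: strip disallowed tags first, then let the parser balance the rest.
$textarea = '<b>bold <i>both <script>alert(1)</script>';
$allowed  = strip_tags($textarea, '<b><i><blockquote><del>');
$balanced = close_open_tags($allowed);
```

Note that this only balances tags. strip_tags() keeps attributes on the allowed tags (an onclick, for example), so this is not an XSS filter on its own.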
Upvotes: 0
Reputation: 53931
You could use Tidy. It will clean up and repair your HTML.
This comment on php.net addresses your problem and shows how to solve it: http://www.php.net/manual/en/tidy.examples.basic.php#89334
Cleaning an HTML fragment (OO support seems to be half-arsed for now)
This will ensure all tags are closed, without adding any html/head/body tags around the fragment.
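<?php
// $html_fragment holds the user-submitted markup to repair.
$tidy_config = array(
    'clean'          => true, // replace presentational markup with style rules
    'output-xhtml'   => true, // emit well-formed XHTML
    'show-body-only' => true, // return only the body contents, no html/head/body wrapper
    'wrap'           => 0,    // disable line wrapping
);
$tidy = tidy_parse_string($html_fragment, $tidy_config, 'UTF8');
$tidy->cleanRepair(); // close unclosed tags and fix nesting
echo $tidy;           // the tidy object casts to the repaired markup
?>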
Upvotes: 5