Reputation: 13682
What are good options for restricting which HTML tags a user is allowed to enter into a form field? I'd like to be able to do that client-side (presumably using JavaScript), server-side in PHP if it's too heavy for the user's browser, and possibly a combination of both if appropriate.
Effectively I'd like users to be able to submit data with the same tag-set as on Stackoverflow, plus maybe the standard MathML tags. The form must accept UTF-8 text, including Asian ideograms, etc.
In the application, users must be able to submit text entries containing basic HTML tags, and those entries must be displayed to (potentially different) users with the HTML rendered correctly and in a way that is safe for them. I'm planning to use htmlspecialchars() and htmlspecialchars_decode() to protect my db server-side.
Many thanks,
JDelage
PS: I searched but couldn't find this question...
Upvotes: 3
Views: 2555
Reputation: 4389
I had a similar problem for some time. There were some $%^&*) who liked to post comments like <script>alert('Hello');</script> or something like that. I got tired of it and wrote a small function that allows only <br> or <br /> tags, so that messages still display normally.
I did it only in PHP, but I think it might help you.
function eliminateTags($msg) {
    // Convert newlines to <br /> so line breaks survive the tag stripping
    $withBreaks = nl2br($msg);
    // Decode entities such as &lt; so that encoded tags are stripped as well
    $decoded = htmlspecialchars_decode($withBreaks);
    // Allowing '<br>' also keeps <br />; since PHP 5.3.4 self-closing forms
    // in the allow-list are ignored, so no PHP version check is needed
    return strip_tags($decoded, '<br>');
}
Upvotes: 0
Reputation: 22184
I think it is much easier to use strip_tags() and just specify the tags you are allowing.
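A minimal sketch of that approach, assuming PHP 7 or later. The helper name keepAllowedTags() and the allow-list are hypothetical and only for illustration. Note that strip_tags() leaves the attributes of allowed tags untouched, so something like an onclick attribute survives; on its own it is not a complete XSS defence.

```php
<?php
// Hypothetical allow-list; adjust it to the tags your application needs.
$allowedTags = '<b><i><em><strong><code><pre><br><p>';

// Hypothetical helper name, not part of PHP itself.
function keepAllowedTags(string $input, string $allowedTags): string
{
    // strip_tags() removes every tag that is not in the allow-list.
    // It does NOT remove attributes from the tags it keeps.
    return strip_tags($input, $allowedTags);
}

$dirty = "<b>Hello</b> <script>alert('x');</script> <i onclick=\"evil()\">日本語</i>";

// The <script> tags are removed (their text content stays), while the
// <i> tag is kept together with its onclick attribute.
echo keepAllowedTags($dirty, $allowedTags), "\n";
```

Because the attributes are kept, this is probably only suitable when combined with further filtering of the allowed tags.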
Upvotes: 1
Reputation: 4211
You could do something like this, if you are familiar with regular expressions:
<?php
function parse($string)
{
    // Escape angle brackets so that raw HTML tags are displayed as text
    $string = str_replace("<", "&lt;", $string); // Replace all < with the HTML entity
    $string = str_replace(">", "&gt;", $string); // Replace all > with the HTML entity
    $find = array(
        "%\*\*\*(.+?)\*\*\*%s", // Search for ***any string here***
        "%`(.+?)`%s",           // Search for `any string here`
    );
    $replace = array(
        "<b>\\1</b>", // Replace with <b>any string here</b>
        "<span style=\"background-color: #DDDDDD\">\\1</span>" // Replace with <span style="background-color: #DDDDDD">any string here</span>
    );
    $string = preg_replace($find, $replace, $string); // Do the find and replace
    return $string; // Return the output
}
echo parse("***Hello*** `There` <b>Friend</b>");
?>
Outputs (as rendered in a browser, with Hello in bold and There highlighted):
Hello There <b>Friend</b>
Upvotes: 0
Reputation: 449613
If you're looking to filter input against XSS attacks etc., consider using an existing library like HTML Purifier. I've not used it myself yet, but it promises a lot and is held in high regard.
HTML Purifier is a standards-compliant HTML filter library written in PHP. HTML Purifier will not only remove all malicious code (better known as XSS) with a thoroughly audited, secure yet permissive whitelist, it will also make sure your documents are standards compliant, something only achievable with a comprehensive knowledge of W3C's specifications.
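For orientation, a basic HTML Purifier setup might look roughly like the sketch below. It assumes the library is already installed; the include path and the allow-list passed to HTML.Allowed are placeholders chosen for illustration, not recommendations from the project.

```php
<?php
// Assumption: HTML Purifier is installed and this (hypothetical) path
// points at its bundled autoloader.
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// Hypothetical allow-list: permitted tags, with allowed attributes in brackets.
$config->set('HTML.Allowed', 'p,br,b,i,em,strong,code,pre,blockquote,ul,ol,li,a[href]');

$purifier = new HTMLPurifier($config);

// Example of user-submitted markup containing UTF-8 text and unsafe parts.
$dirty = '<b>Hello</b> <script>alert("x");</script> <a href="javascript:alert(1)">日本語</a>';

// purify() returns HTML restricted to the configured allow-list.
$clean = $purifier->purify($dirty);

echo $clean;
```

One possible design is to store the submitted text unchanged and purify it on output, or to purify once on input and store the cleaned HTML; either way the purified result is the version shown to other users.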
Upvotes: 3