JDelage

Reputation: 13682

How to restrict or limit the HTML tags a user can enter in a web form, preferably client-side?

What are good options to restrict the type of html tags a user is allowed to enter into a form field? I'd like to be able to do that client side (presumably using JavaScript), server-side in PHP if it's too heavy for the user's browser, and possibly a combo of both if appropriate.

Effectively, I'd like users to be able to submit data with the same tag set as on Stack Overflow, plus maybe the standard MathML tags. The form must accept UTF-8 text, including Asian ideograms, etc.

In the application, users must be able to submit text entries containing basic HTML tags, and those entries must be displayed to (potentially different) users with the HTML rendered correctly and in a way that is safe for those users. I'm planning to use htmlspecialchars() and htmlspecialchars_decode() to protect my db server-side.

Many thanks,

JDelage

PS: I searched but couldn't find this question...

Upvotes: 3

Views: 2555

Answers (4)

Eugene

Reputation: 4389

I had a similar problem for some time. There were some $%^&*) who liked to post comments like <script>alert('Hello');</script>. I got tired of it and wrote a small function that allows only <br> or <br /> tags, so messages still display with normal line breaks. I did it only in PHP, but I think it might help you.

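function eliminateTags($msg) {
    // Convert newlines to <br /> so line breaks survive the stripping step
    $withBreaks = nl2br($msg);
    // Decode entities so that encoded tags (&lt;script&gt;) are stripped as well
    $decoded = htmlspecialchars_decode($withBreaks);

    // Allow only line breaks. On current PHP versions the allowed-tags list
    // should use the plain form: "<br>" also keeps "<br />", while a
    // self-closing "<br />" entry in the list is ignored.
    return strip_tags($decoded, "<br>");
}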

Upvotes: 0

Ionuț Staicu

Reputation: 22184

I think it's much easier to use strip_tags and just specify the tags you want to allow.
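As a rough illustration of that suggestion, here is a minimal sketch. The input string and the choice of `<b>` and `<i>` as allowed tags are made up for the example. Note that strip_tags only filters tag names: attributes on allowed tags are left in place, and the text inside a removed element stays in the output, so it is probably not enough on its own for untrusted input.

```php
<?php
// Hypothetical input: an allowed tag carrying an attribute, another allowed
// tag, and a script element that should not survive.
$input = '<b onclick="evil()">bold</b> <i>italic</i> <script>alert(1)</script>';

// The second argument lists the tags to keep, written as opening tags.
$clean = strip_tags($input, '<b><i>');

// The <script> tags are gone, but the onclick attribute and the
// script's text content are both still present in $clean.
echo $clean;
```

If attributes matter (and for XSS they do), the result would still need further filtering.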

Upvotes: 1

OdinX

Reputation: 4211

You could do something like this, if you are familiar with regular expressions:

<?php

function parse($string)
{
    // To stop unwanted HTML tags being used
    $string = str_replace("<", "&lt;", $string); // Replace all < with the HTML equivalent
    $string = str_replace(">", "&gt;", $string); // Replace all > with the HTML equivalent

    $find = array(
        "%\*\*\*(.+?)\*\*\*%s", // Search for ***any string here***
        "%`(.+?)`%s",           // Search for `any string here`
    );

    $replace = array(
        "<b>\\1</b>",                                          // Replace with <b>any string here</b>
        "<span style=\"background-color: #DDDDDD\">\\1</span>" // Replace with <span style="background-color: #DDDDDD">any string here</span>
    );

    $string = preg_replace($find, $replace, $string); // Do the find and replace
    return $string; // Return the output
}

echo parse("***Hello*** `There` <b>Friend</b>");
?>

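Outputs (raw HTML):

<b>Hello</b> <span style="background-color: #DDDDDD">There</span> &lt;b&gt;Friend&lt;/b&gt;

In the browser this renders as a bold Hello, a highlighted There, and the literal text <b>Friend</b>.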
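The same escape-first idea can be stretched to let a few real HTML tags through, which is closer to what the question asks for. The sketch below is only an illustration: the function name allowBasicTags and the default tag list are hypothetical, it restores only bare tags without attributes, and it does not check that tags are balanced.

```php
<?php
// Hypothetical helper: escape everything first, then restore a small
// allowlist of attribute-free tags such as <b> and </b>.
function allowBasicTags(string $input, array $allowed = ['b', 'i', 'em', 'strong', 'code']): string
{
    // Escape all markup and quotes; UTF-8 text (e.g. ideograms) passes through unchanged.
    $escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

    // Match only escaped bare tags like &lt;b&gt; or &lt;/b&gt; for the allowed names.
    $names = implode('|', array_map('preg_quote', $allowed));
    $pattern = '%&lt;(/?)(' . $names . ')&gt;%i';

    // Turn those matches back into real tags; everything else stays escaped.
    return preg_replace($pattern, '<$1$2>', $escaped);
}

echo allowBasicTags('<b>Hi</b> <script>alert(1)</script> 漢字');
```

Because a tag with any attribute never matches the pattern, something like <b onclick="..."> stays escaped and is shown as text rather than executed.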

Upvotes: 0

Pekka

Reputation: 449613

If you're looking to filter input against XSS attacks etc., consider using an existing library like HTML Purifier. I've not used it myself yet, but it promises a lot and is highly regarded.

HTML Purifier is a standards-compliant HTML filter library written in PHP. HTML Purifier will not only remove all malicious code (better known as XSS) with a thoroughly audited, secure yet permissive whitelist, it will also make sure your documents are standards compliant, something only achievable with a comprehensive knowledge of W3C's specifications.

Upvotes: 3
