streetlight

Reputation: 5968

Determine if string is encoded HTML via jQuery?

I'm working on a validation script, but I'm running into a very particular issue.

If a user enters a string that happens to be an encoded HTML character (like &amp; or &#38;), it will output as the character (& in this case). My question is this: is it possible to write a function that determines whether a string is an encoded character? So if the user enters one of the two options above, I want to launch a particular function, and if it's a non-encoded character, I want to do something else.

Is there a way to do this?

Upvotes: 1

Views: 3583

Answers (3)

Cerbrus

Reputation: 72947

You can check if a string contains encoded characters by comparing the encoded vs decoded lengths:

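var string = "Your encoded & decoded string here";

// Decode a few common named entities. decodeURIComponent is not used here:
// it decodes URL percent-escapes, not HTML entities, and throws on a lone "%".
// &amp; is handled last so that "&amp;lt;" becomes "&lt;" rather than "<".
function decode(str){
    return str.replace(/&lt;/g,'<').replace(/&gt;/g,'>').replace(/&quot;/g,'"').replace(/&amp;/g,'&');
}

if(string.length === decode(string).length){
    // The string does not contain any of the entities handled above.
}else{
    // The string contains at least one encoded entity.
}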

Also, this is significantly faster than the jQuery method that was suggested.
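
The same comparison idea can be extended to numeric references such as &#38; or &#x26;. The following is only a rough sketch: NAMED_ENTITIES, decodeEntities and hasEncodedEntity are hypothetical names, the table covers just a handful of named entities, and the decoded string is compared to the input directly instead of comparing lengths.

```javascript
// Hypothetical lookup table: only a few named entities are covered here.
var NAMED_ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };

// Decode named, decimal (&#38;) and hexadecimal (&#x26;) references in one pass.
// Unknown names and out-of-range code points are left untouched.
function decodeEntities(str) {
    var pattern = /&(?:#x([0-9a-f]+)|#(\d+)|([a-z][a-z0-9]*));/gi;
    return str.replace(pattern, function (match, hex, dec, name) {
        if (hex || dec) {
            var code = hex ? parseInt(hex, 16) : parseInt(dec, 10);
            return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
        }
        return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, name)
            ? NAMED_ENTITIES[name]
            : match;
    });
}

// True when decoding changed the string, i.e. at least one known entity was found.
function hasEncodedEntity(str) {
    return decodeEntities(str) !== str;
}

var fromNamed = hasEncodedEntity('&amp;');        // true
var fromNumeric = hasEncodedEntity('&#38;');      // true
var fromPlain = hasEncodedEntity('Tom & Jerry');  // false
```

Note that this only tells you the text looks like an entity. It cannot tell you whether the user meant it as one.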

Upvotes: 2

James South

Reputation: 10645

Something like this would do it.

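function containsEncoded (val){
    // Match a named (&amp;), decimal (&#38;) or hexadecimal (&#x26;) reference.
    // A bare "&" followed by arbitrary text is not treated as encoded.
    var rHTMLEncoded = /&(?:[a-z][a-z0-9]*|#\d+|#x[0-9a-f]+);/i;

    return rHTMLEncoded.test(val);
}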


// Usage 
var encoded = containsEncoded("&amp;");

Upvotes: 0

deceze

Reputation: 522451

By definition, if you do not know whether something is an encoded HTML entity or not, you do not know. You either treat all text coming from a certain source as encoded, or you treat all of it as not encoded. Why? Because it's all just text. "&amp;" is just text. I meant to write "&amp;" here. I do not want anyone to interpret it; I want it to appear literally as "&amp;".

How do you know what the user meant? If you start replacing user-entered text based on guesses, you'll always get it wrong in some cases. It's the typical case where every ":D" is replaced by a graphical smiley, which is annoying when you actually wanted to type ":D".

If you want to always preserve exactly what the user entered, always run all user input through an HTML-encoding function which replaces all special characters with entities. See The Great Escapism (Or: What You Need To Know To Work With Text Within Text).
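
As a rough illustration of that last point, here is a minimal sketch of such an encoding function. escapeHtml is a hypothetical helper name, not a built-in, and the set of five escaped characters is an assumption that suits element content and quoted attribute values.

```javascript
// Hypothetical helper: replaces the characters that are special in HTML
// with entities, so the browser shows exactly what the user typed.
function escapeHtml(text) {
    var entities = {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;',
        '"': '&quot;',
        "'": '&#39;'
    };
    return String(text).replace(/[&<>"']/g, function (ch) {
        return entities[ch];
    });
}

// The user typed the five characters "&amp;" and wants to see them literally.
var literal = escapeHtml('&amp;');              // "&amp;amp;"
var markup = escapeHtml('<b>Tom & Jerry</b>');  // "&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;"
```

Because every input goes through the same function, there is no need to guess whether a given string was "already encoded".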

Upvotes: 3
