Jim D

Reputation: 31

What is the advantage of using .innerHTML for escaping characters?

I'm trying to figure out how to use escape characters in JS/HTML but I can't figure out how to do it. I've seen examples of .innerHTML being used but I don't understand how. Can someone please explain it in simple terms?

Upvotes: 1

Views: 144

Answers (1)

Pointy

Reputation: 414036

If you add content as raw text (like, as the value of a text node), and then query the .innerHTML of the container, you get back escaped HTML because that's what it'd have to look like if you were to set the .innerHTML:

    var d = document.createElement('span');
    var t = document.createTextNode("<b>Hello World</b>");
    d.appendChild(t);
    console.log(d.innerHTML); // logs &lt;b&gt;Hello World&lt;/b&gt;

It's just the way that the .innerHTML mechanism behaves.

According to the MDN documentation, the only characters that are affected are <, >, and &. There are times when it's useful to encode other characters as HTML entities too. The most common case, I think, is when you want to put quotes inside an HTML attribute value.
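To make the attribute case concrete, here's a minimal sketch. The helper name `escapeAttribute` is made up for this example (it isn't a built-in or part of any library), and it assumes the value ends up inside a double-quoted attribute:

```javascript
// Hypothetical helper (not a built-in): escape a value so it can sit
// safely inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")   // must run first, or the entities added below get re-escaped
    .replace(/"/g, "&quot;")  // a raw " would end the attribute value early
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

var title = 'Say "hi" & wave';
var html = '<span title="' + escapeAttribute(title) + '">greeting</span>';
console.log(html);
// logs <span title="Say &quot;hi&quot; &amp; wave">greeting</span>
```

Without the `&quot;` step the first quote in the value would close the attribute, and the rest of the string would be parsed as markup.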

An alternative to relying on the browser's DOM behavior is to use your own JavaScript function. Here's a (slightly modified) version of the code used in the doT template library:

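    // Build the encoder once; the rules and the regex live in the closure.
    function encodeHTMLSource() {
      var encodeHTMLRules = {
            "&": "&#38;", "<": "&#60;", ">": "&#62;", '"': '&#34;', "'": '&#39;', "/": '&#47;'
          },
          // Match a bare & (one not already starting an entity), or < > " ' /
          matchHTML = /&(?!#?\w+;)|<|>|"|'|\//g;

      // `this` is the string the method is called on.
      return function() {
        return this ? this.replace(matchHTML, function(m) {
          return encodeHTMLRules[m] || m;
        }) : this;
      };
    }
    String.prototype.encodeHTML = encodeHTMLSource();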

This function is designed to be added to the String prototype, which some might find distasteful (that seems to be a recent change; my older version doesn't do this). The idea is that it uses a closure to keep a mapping from the "naughty" characters to their HTML entity equivalents, as well as a regular expression to find characters to convert. Once you've done the above, you can escape any string with:

    var escaped = "<b>Hello World</b>".encodeHTML(); // "&#60;b&#62;Hello World&#60;&#47;b&#62;"

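The regular expression is written so that it avoids re-encoding existing HTML entities: the negative lookahead (?!#?\w+;) skips any & that already begins something that looks like an entity, such as &amp; or &#38;.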
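If you'd rather not touch String.prototype, the same rules and pattern work as a plain function. This is only a sketch: the names `encodeHTML`, `ENCODE_HTML_RULES`, and `MATCH_HTML` are my own for this example, not doT's. It also shows the "don't re-encode" behavior:

```javascript
// Same rules and pattern as above; only the wrapper differs.
var ENCODE_HTML_RULES = {
  "&": "&#38;", "<": "&#60;", ">": "&#62;",
  '"': "&#34;", "'": "&#39;", "/": "&#47;"
};
var MATCH_HTML = /&(?!#?\w+;)|<|>|"|'|\//g;

// Hypothetical name: a standalone function instead of a prototype method.
function encodeHTML(text) {
  return String(text).replace(MATCH_HTML, function (m) {
    return ENCODE_HTML_RULES[m] || m;
  });
}

console.log(encodeHTML("<b>Tom & Jerry</b>"));
// logs &#60;b&#62;Tom &#38; Jerry&#60;&#47;b&#62;
console.log(encodeHTML("Tom &amp; Jerry"));
// logs Tom &amp; Jerry  (the existing entity is left alone)
```

Note that the lookahead is a heuristic. Any & followed by word characters and a semicolon is treated as an entity, so a string like "AT&T;" comes back unchanged.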

Upvotes: 3
