Reputation: 6599

Native JavaScript or ES6 way to encode and decode HTML entities?

Is there a native way to encode or decode HTML entities using JavaScript or ES6? For example, < would be encoded as &lt;. There are libraries like html-entities for Node.js, but it feels like there should be something built into JavaScript that already handles this common need.

Upvotes: 44

Answers (7)

The Jared Wilcurt

Reputation: 340

The top answer was kinda unreadable and unmaintainable. I rewrote it so that all you need to do to tweak it is add (or remove) key/value pairs in the map. I also swapped &#39; with &apos; because I'm assuming you're using HTML5+, which had widespread browser adoption by 2011.

/**
 * Escapes special HTML characters.
 *
 * @example
 * '<div title="text">1 & 2</div>'
 * becomes
 * '&lt;div title=&quot;text&quot;&gt;1 &amp; 2&lt;/div&gt;'
 *
 * @param  {string} value  Any input string.
 * @return {string}        The same string, but with encoded HTML entities.
 */
export const escapeHtml = function (value) {
  // https://html.spec.whatwg.org/multipage/named-characters.html
  const namedHtmlEntityMap = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '\'': '&apos;',
    '"': '&quot;'
  };
  const charactersToEncode = Object.keys(namedHtmlEntityMap).join('');
  const regexp = new RegExp('[' + charactersToEncode + ']', 'g');
  const encode = function (character) {
    return namedHtmlEntityMap[character];
  };

  return value.replace(regexp, encode);
};

Upvotes: 0

Kamran Gasimov

Reputation: 1783

Simple htmlEncode and htmlDecode

HTML Encode Function

  function encodeHtml(str) {
    let buf = [];

    for (var i = str.length - 1; i >= 0; i--) {
      if (!(/^[a-zA-Z0-9]$/.test(str[i])))
        buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
      else
        buf.unshift(str[i])
    }

    return buf.join('');
  }

HTML Decode function

  function decodeHtml(str) {
    return str.replace(/&#(\d+);/g, function(match, dec) {
      return String.fromCharCode(dec);
    });
  }

Upvotes: -1

Ryan - Llaver

Reputation: 526

The reverse (decode) of the answer (encode) @asafel provided:

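// Match only the five entities the encoder produces; anything else is left untouched.
// (A pattern like /&(\D+);/ skips '&#39;' and greedily swallows text between two entities.)
const decodeEscapedHTML = (str) =>
  str.replace(
    /&(?:amp|lt|gt|#39|quot);/g,
    (tag) =>
      ({
        '&amp;': '&',
        '&lt;': '<',
        '&gt;': '>',
        '&#39;': "'",
        '&quot;': '"',
      }[tag]),
  )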

Upvotes: 1

asafel

Reputation: 811

A nice function using ES6 for escaping HTML:

const escapeHTML = str => str.replace(/[&<>'"]/g, 
  tag => ({
      '&': '&amp;',
      '<': '&lt;',
      '>': '&gt;',
      "'": '&#39;',
      '"': '&quot;'
    }[tag]));
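
An escape function like this often ends up being called inside template literals. As a minimal sketch, assuming a tag function named html (a hypothetical helper, not part of any library or of the answer above), every interpolated value can be escaped automatically while the literal markup passes through:

```javascript
// Same five-character escape as in the answer above.
const escapeHTML = str => str.replace(/[&<>'"]/g, tag => ({
  '&': '&amp;', '<': '&lt;', '>': '&gt;', "'": '&#39;', '"': '&quot;'
}[tag]));

// Hypothetical tag function: literal parts are kept as written,
// interpolated values are stringified and escaped.
// reduce() without an initial value starts from strings[0], so
// values[i - 1] is the value that sits just before strings[i].
const html = (strings, ...values) =>
  strings.reduce(
    (out, literal, i) => out + escapeHTML(String(values[i - 1])) + literal
  );

const userInput = '<img src=x onerror="alert(1)">';
const markup = html`<p title="${userInput}">${userInput}</p>`;
console.log(markup);
```

The literal parts of the template are trusted as written; only the values inside ${...} are escaped, so this covers text and quoted attribute values but not contexts such as script bodies or URLs.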

Upvotes: 34

Code Guru

Reputation: 15578

To unescape HTML entities, your browser is smart and will do it for you.

Way1

function _unescape(html: string): string {
   // Let the browser's HTML parser decode the entities.
   const divElement = document.createElement("div");
   divElement.innerHTML = html;
   return divElement.textContent || divElement.innerText || "";
}

Way2

function _unescape(html: string): string {
     let returnText = html;
     returnText = returnText.replace(/&nbsp;/gi, " ");
     returnText = returnText.replace(/&quot;/gi, `"`);
     returnText = returnText.replace(/&lt;/gi, "<");
     returnText = returnText.replace(/&gt;/gi, ">");
     // Decode &amp; last so that "&amp;lt;" becomes "&lt;" rather than "<".
     returnText = returnText.replace(/&amp;/gi, "&");
     return returnText;
}

You can also use Underscore's or Lodash's unescape method, but it ignores &nbsp; and handles only the &amp;, &lt;, &gt;, &quot;, and &#39; entities.
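
Chained replace calls are order-sensitive: if &amp; is decoded before the other entities, "&amp;lt;" ends up as "<" instead of "&lt;". One way around the ordering question is to decode everything in a single pass. The following is a rough sketch, not a full decoder: the names NAMED and unescapeOnce are made up for illustration, and the map covers only a handful of entities rather than the full HTML named-character list.

```javascript
// Small illustrative map; extend as needed.
// Note: &nbsp; maps to U+00A0 (no-break space), not a plain space.
const NAMED = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: '\u00A0' };

// Each entity is matched and replaced exactly once, so the output of one
// replacement is never re-scanned and cannot be decoded a second time.
const unescapeOnce = html =>
  html.replace(/&(#\d+|#x[0-9a-f]+|[a-z]+);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
    }
    // Unknown names are returned unchanged.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });

console.log(unescapeOnce('&amp;lt; stays escaped once, &lt;b&gt; does not'));
```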

Upvotes: 5

KyleMit

Reputation: 29909

Roll Your Own (caveat - use HE instead for most use cases)

For pure JS without a library, you can encode and decode HTML entities like this:

let encode = str => {
  let buf = [];

  for (var i = str.length - 1; i >= 0; i--) {
    buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
  }

  return buf.join('');
}

let decode = str => {
  return str.replace(/&#(\d+);/g, function(match, dec) {
    return String.fromCharCode(dec);
  });
}

Usages:

encode("Hello > © <") // "&#72;&#101;&#108;&#108;&#111;&#32;&#62;&#32;&#169;&#32;&#60;"
decode("Hello &gt; &copy; &#169; &lt;") // "Hello &gt; &copy; © &lt;"

However, you can see this approach has a couple of shortcomings:

It encodes even safe characters: H → &#72;
It can decode numeric codes (outside the astral plane), but it doesn't know anything about the full list of HTML entities / named character references supported by browsers, like &gt;
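
Both shortcomings can be narrowed a little while still staying library-free. The sketch below is one possible variant, not a replacement for HE: the names encodeEntities and decodeEntities are made up here, named references such as &gt; are still not decoded, and the choice of which characters count as "safe" (tab, newline, carriage return, and printable ASCII other than & < > " ') is an assumption.

```javascript
// Encode only HTML-special characters and anything outside printable ASCII.
// The `u` flag makes the regex match whole code points, so an astral
// character such as U+1D306 becomes one reference instead of two surrogates.
const encodeEntities = str =>
  str.replace(/[&<>"']|[^\t\n\r\x20-\x7E]/gu, ch => `&#${ch.codePointAt(0)};`);

// String.fromCodePoint (unlike String.fromCharCode) handles values above 0xFFFF.
const decodeEntities = str =>
  str.replace(/&#(\d+);/g, (match, dec) => {
    const code = Number(dec);
    // Leave out-of-range references untouched instead of throwing a RangeError.
    return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
  });

console.log(encodeEntities('Hello > © <')); // "Hello &#62; &#169; &#60;"
console.log(decodeEntities('&#119558;'));   // "𝌆" (U+1D306)
```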

Use the HE Library (Html Entities)

Support for all standardized named character references
Support for Unicode
Works with ambiguous ampersands
Written by Mathias Bynens

Usage:

he.encode('foo © bar ≠ baz 𝌆 qux'); 
// Output : 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// Output : 'foo © bar ≠ baz 𝌆 qux'
