Reputation: 6599
Is there a native way to encode or decode HTML entities using JavaScript or ES6? For example, < would be encoded as &lt;. There are libraries like html-entities for Node.js, but it feels like there should be something built into JavaScript that already handles this common need.
Upvotes: 44
Views: 63359
Reputation: 340
The top answer was hard to read and hard to maintain. I rewrote it so that tweaking it only requires adding or removing key/value pairs in the map. I also swapped &#39; for &apos; because I'm assuming you're using HTML5+, which had widespread browser adoption by 2011.
/**
 * Escapes special HTML characters.
 *
 * @example
 * '<div title="text">1 & 2</div>'
 * becomes
 * '&lt;div title=&quot;text&quot;&gt;1 &amp; 2&lt;/div&gt;'
 *
 * @param {string} value Any input string.
 * @return {string} The same string, but with encoded HTML entities.
 */
export const escapeHtml = function (value) {
  // https://html.spec.whatwg.org/multipage/named-characters.html
  const namedHtmlEntityMap = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '\'': '&apos;',
    '"': '&quot;'
  };
  const charactersToEncode = Object.keys(namedHtmlEntityMap).join('');
  const regexp = new RegExp('[' + charactersToEncode + ']', 'g');
  const encode = function (character) {
    return namedHtmlEntityMap[character];
  };
  return value.replace(regexp, encode);
};
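One caveat when adding keys to a map like this: the keys are joined straight into a RegExp character class, so a key such as `]`, `^`, `-`, or `\` would change the meaning of the class. The following is a minimal sketch of one way to guard against that, assuming every key is a single character. The names `makeEscaper` and `escapeHtmlExtended` are hypothetical and not part of the answer above.

```javascript
// Hypothetical helper: builds an escaper from any single-character map.
const makeEscaper = (entityMap) => {
  // Backslash-escape the characters that are special inside a character class.
  const escapeForClass = (ch) => ch.replace(/[\\\]^-]/g, '\\$&');
  const chars = Object.keys(entityMap).map(escapeForClass).join('');
  const regexp = new RegExp('[' + chars + ']', 'g');
  return (value) => value.replace(regexp, (ch) => entityMap[ch]);
};

// Illustrative map: the five usual characters plus two regex-sensitive ones.
const escapeHtmlExtended = makeEscaper({
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  "'": '&apos;',
  '"': '&quot;',
  ']': '&#93;', // unescaped, this would close the character class early
  '-': '&#45;'  // unescaped, this could form an unintended range
});

console.log(escapeHtmlExtended('a-b]c<d'));
```

With the escaping step in place, the extra keys behave like the original five; without it, the generated pattern would silently match something else.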
Upvotes: 0
Reputation: 1783
function encodeHtml(str) {
  const buf = [];
  for (let i = str.length - 1; i >= 0; i--) {
    if (!(/^[a-zA-Z0-9]$/.test(str[i])))
      buf.unshift(['&#', str[i].charCodeAt(0), ';'].join(''));
    else
      buf.unshift(str[i]);
  }
  return buf.join('');
}
HTML Decode function
function decodeHtml(str) {
  return str.replace(/&#(\d+);/g, function(match, dec) {
    return String.fromCharCode(dec);
  });
}
Upvotes: -1
Reputation: 526
The reverse (decode) of the answer (encode) @rasafel provided:
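const decodeEscapedHTML = (str) =>
  str.replace(
    // Match only the known entities; a greedy /&(\D+);/ can swallow several
    // entities at once and cannot match the numeric &#39;.
    /&(?:amp|lt|gt|#39|quot);/g,
    (tag) =>
      ({
        '&amp;': '&',
        '&lt;': '<',
        '&gt;': '>',
        '&#39;': "'",
        '&quot;': '"',
      }[tag]),
  )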
Upvotes: 1
Reputation: 811
A nice function using ES6 for escaping HTML:
const escapeHTML = str => str.replace(/[&<>'"]/g,
  tag => ({
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    "'": '&#39;',
    '"': '&quot;'
  }[tag]));
Upvotes: 34
Reputation: 15578
To unescape HTML entities, your browser is smart and will do it for you.
Way1
_unescape(html: string): string {
  // Note: innerHTML parses markup, so only pass trusted strings here.
  const divElement = document.createElement("div");
  divElement.innerHTML = html;
  return divElement.textContent || divElement.innerText || "";
}
Way2
_unescape(html: string): string {
  let returnText = html;
  returnText = returnText.replace(/&nbsp;/gi, " ");
  returnText = returnText.replace(/&quot;/gi, `"`);
  returnText = returnText.replace(/&lt;/gi, "<");
  returnText = returnText.replace(/&gt;/gi, ">");
  // Decode &amp; last so that "&amp;lt;" yields "&lt;" rather than "<".
  returnText = returnText.replace(/&amp;/gi, "&");
  return returnText;
}
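Chained replace calls like Way2 are sensitive to ordering: if &amp; is decoded before the other entities, input such as &amp;lt; is decoded twice. The sketch below contrasts that with a single-pass variant; `unescapeChained`, `unescapeOnce`, and `ENTITY_TO_CHAR` are hypothetical names used only for illustration, and the entity list simply mirrors Way2.

```javascript
// Order-sensitive version: "&amp;" is decoded first, so its output is decoded again.
const unescapeChained = (html) =>
  html.replace(/&amp;/g, '&').replace(/&lt;/g, '<').replace(/&gt;/g, '>');

// Single-pass version: every entity in the input is decoded exactly once.
const ENTITY_TO_CHAR = {
  '&nbsp;': ' ', // plain space, as in Way2 (U+00A0 would be the literal equivalent)
  '&amp;': '&',
  '&quot;': '"',
  '&lt;': '<',
  '&gt;': '>'
};
const unescapeOnce = (html) =>
  html.replace(/&(?:nbsp|amp|quot|lt|gt);/g, (entity) => ENTITY_TO_CHAR[entity]);

console.log(unescapeChained('&amp;lt;b&amp;gt;')); // decoded twice
console.log(unescapeOnce('&amp;lt;b&amp;gt;'));    // decoded once
```

The single-pass form also keeps the entity list in one place, which makes it easier to extend.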
You can also use Underscore's or Lodash's unescape method, but it ignores &nbsp; and handles only the &amp;, &lt;, &gt;, &quot;, and &#39; entities.
Upvotes: 5
Reputation: 29909
For pure JS without a library, you can encode and decode HTML entities like this:
let encode = str => {
  let buf = [];
  for (var i = str.length - 1; i >= 0; i--) {
    buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
  }
  return buf.join('');
}
let decode = str => {
  return str.replace(/&#(\d+);/g, function(match, dec) {
    return String.fromCharCode(dec);
  });
}
Usages:
encode("Hello > © <") // "&#72;&#101;&#108;&#108;&#111;&#32;&#62;&#32;&#169;&#32;&#60;"
decode("Hello &gt; &copy; &#169; &lt;") // "Hello &gt; &copy; © &lt;"
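However, you can see that this approach has a couple of shortcomings:
- It encodes even safe characters: H → &#72;
- It decodes only decimal numeric codes and knows nothing about named entities such as &gt;
Usage of the he library: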
he.encode('foo © bar ≠ baz 𝌆 qux');
// Output : 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'
he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// Output : 'foo © bar ≠ baz 𝌆 qux'
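If pulling in a library is not an option, the hand-rolled encode/decode pair above can be tightened somewhat. The following is a rough sketch, not a replacement for he: it leaves safe ASCII alone, handles characters outside the Basic Multilingual Plane via `codePointAt` and `String.fromCodePoint`, and decodes both decimal and hexadecimal references. It still knows nothing about named entities. `encodeEntities` and `decodeNumericEntities` are hypothetical names.

```javascript
// Encode the five HTML-significant characters plus anything outside printable
// ASCII (tab, newline, and carriage return are left alone). The `u` flag makes
// the regex match whole code points, so U+1D306 is one match, not two surrogates.
const encodeEntities = (str) =>
  str.replace(/[&<>"']|[^\t\n\r\x20-\x7E]/gu, (ch) =>
    '&#x' + ch.codePointAt(0).toString(16).toUpperCase() + ';');

// Decode decimal (&#169;) and hexadecimal (&#xA9;) numeric references only.
const decodeNumericEntities = (str) =>
  str.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references untouched instead of throwing a RangeError.
    return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
  });

console.log(encodeEntities('foo © bar ≠ baz 𝌆 qux'));
console.log(decodeNumericEntities('&#72;i &#x1D306; &gt;'));
```

Named references such as &gt; or &copy; pass through `decodeNumericEntities` unchanged, which is the gap a full entity table (as shipped by he) fills.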
Upvotes: 11
Reputation: 1048
There is no native function in the JavaScript API that converts ASCII characters to their "html-entities" equivalents. Here is the beginning of a solution and an easy trick that you may like.
Upvotes: 6