Reputation: 357
I have lots of characters in the form &#182;
which I would like to display as Unicode characters in my text editor.
This ought to convert them:
var newtext = doctext.replace(
/&#(\d+);/g,
String.fromCharCode(parseInt("$1", 10))
);
But it doesn't seem to work. The regular expression /&#(\d+);/
is getting me the numbers out -- but the String.fromCharCode
does not appear to give the results I'd like. What is up?
Upvotes: 2
Views: 1236
Reputation: 198436
The replace method is not foolproof if you are dealing with full HTML (i.e. you don't control what the input is). For example, the method submitted by Jack (and obviously the idea in the original post as well) works excellently if your entities are all decimal, but doesn't work for hex ones like &#x41;, and even less for named entities like &quot;.
For this, there is another trick you can do: create an element, set its innerHTML to the source, then read out its text value. Basically, browsers know what to do with entities, so we delegate. :) In jQuery it is easy:
$('<div/>').html('&amp;').text()
// => "&"
With plain JS it gets a bit more verbose:
var el = document.createElement('div'); // createElement needs a tag name
el.innerHTML = '&amp;';
el.textContent
// => "&"
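If no DOM is available (in Node, say), the regex approach can be stretched to cover hex references as well as decimal ones. The following is only a sketch: decodeNumericEntities is a made-up helper name, not part of any library, and named entities such as &quot; are deliberately left alone because they would need a lookup table.

```javascript
// Hypothetical helper: decodes decimal (&#182;) and hex (&#xB6;) references only.
function decodeNumericEntities(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|(\d+));/g, function (match, hex, dec) {
    // Exactly one of the two groups matched; the other one is undefined.
    var codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // String.fromCodePoint throws above 0x10FFFF, so leave such references untouched.
    return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
  });
}
```

String.fromCodePoint is used rather than String.fromCharCode so that references above 0xFFFF come out as a proper surrogate pair.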
Upvotes: 2
Reputation: 173642
The replacement part should be an anonymous function instead of an expression:
var newtext = doctext.replace(
/&#(\d+);/g,
function($0, $1) {
return String.fromCharCode(parseInt($1, 10));
}
);
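To see why the original attempt fails: the second argument is an ordinary expression, so it is evaluated once, before replace ever runs, and "$1" at that point is just a two-character string. A small side-by-side sketch (the sample input and the variable names eager, broken and fixed are mine, for illustration only):

```javascript
// Evaluated once, up front: parseInt("$1", 10) is NaN,
// and String.fromCharCode(NaN) yields the NUL character "\u0000".
var eager = String.fromCharCode(parseInt("$1", 10));

// Every match is therefore replaced with that same NUL character.
var broken = "a&#182;b".replace(/&#(\d+);/g, eager);

// A callback is invoked per match and receives the captured digits.
var fixed = "a&#182;b".replace(/&#(\d+);/g, function (match, digits) {
  return String.fromCharCode(parseInt(digits, 10));
});
```

Here broken ends up as "a\u0000b", while fixed is "a¶b".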
Upvotes: 6