Reputation: 7127
var str = 'let us pretend that this is a blog about gardening&cooking; here&apos;s an apostrophe & ampersand just for fun.';
This is the string I'm operating on. The desired end result is: "let us pretend that this is a blog about gardening&cooking; here's an apostrophe & ampersand just for fun."
console.log('Before: ' + str);
str = str.replace(/&(?:#x?)?[0-9a-z]+;?/gi, function(m){
    var d = document.createElement('div');
    console.log(m);
    d.innerHTML = m.replace(/&/, '&amp;');
    console.log(d.innerHTML + '|' + d.textContent);
    return !!d.textContent.match(m.replace(/&/, '&amp;')[0]) ? m : d.textContent;
});
console.log('After: ' + str);
Upvotes: 1
Views: 148
Reputation: 664548
This should do what you want:
str.replace(/&([#x]\d+;|[a-z]+;)/g, "&amp;$1")
or, with a positive lookahead:
str.replace(/&(?=[#x]\d+;|[a-z]+;)/g, "&amp;")
I don't think you need any HTML2text en-/decoding.
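As a rough illustration of what the lookahead version does, here is a small sketch that runs without a DOM. It assumes the replacement string is the escaped ampersand &amp;; the helper name escapeEntityAmpersands and the sample string are made up for the example.

```javascript
// Hypothetical helper name; the pattern is the lookahead version shown above.
function escapeEntityAmpersands(input) {
  // Only an "&" that starts something shaped like an entity is replaced.
  return input.replace(/&(?=[#x]\d+;|[a-z]+;)/g, '&amp;');
}

// Made-up sample: a named entity, a decimal reference, a bare "&", a hex reference.
const sample = 'fish &amp; chips, here&#39;s a bare & and a hex &#x27;';
const escaped = escapeEntityAmpersands(sample);

console.log(escaped);
// fish &amp;amp; chips, here&amp;#39;s a bare & and a hex &#x27;
```

Note that the bare ampersand is left alone, and so is the hex reference &#x27;, because [#x]\d+ expects digits right after the # or x. If hex references matter, the pattern would need an extra alternative for them.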
Upvotes: 0
Reputation: 7471
The problem is that HTML doesn't support XML's &apos;. To avoid the issue you should use &#39; instead of &apos;.
For more information look at this post: Why shouldn't &apos; be used to escape single quotes?
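If the string cannot be changed at its source, one possible workaround is to rewrite the entity before handing the text to the browser for decoding. This is only a minimal sketch; the helper name normalizeApos is made up for the example.

```javascript
// Hypothetical helper: swap the XML-only named entity for its numeric form,
// which HTML parsers decode to a single quote.
function normalizeApos(html) {
  return html.replace(/&apos;/g, '&#39;');
}

const fixed = normalizeApos('here&apos;s an apostrophe');
console.log(fixed); // here&#39;s an apostrophe
```

The result can then be decoded as in the question, for example by assigning it to innerHTML and reading textContent back.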
Upvotes: 1