Reputation: 3159
I have content that consists of a string of HTML elements along with images. For example:
var str = "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";
The text between &lt; and &gt; is a dirty tag, and I would like to remove it along with the content inside it. The tag is generated dynamically, so it could be any tag, e.g. <div>, <a>, <h1>, etc.
The expected output: <p></p><p><br></p><p><br></p>
However, with this code I'm only able to remove the tags, not the content inside them:
str.replaceAll(/&lt;.*?&gt;/g, "");
It renders like this, which is not what I'm looking for:
<p>fwefwefw</p><p><br></p><p><br></p>
How can I remove the &lt;...&gt; tags along with their content, so that I get rid of both the dirty tags and the text inside them?
fiddle: https://jsfiddle.net/3rozjn8m/
thanks
Upvotes: 0
Views: 1221
Reputation: 84
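Remove the question mark. Without it the quantifier is greedy, so the match runs from the first &lt; to the last &gt; and takes the text in between with it:
var str = "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";
console.log(str.replaceAll(/&lt;.*&gt;/g, ""));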
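One caveat with the greedy version: it removes everything from the first &lt; to the last &gt; in the whole string, so if there are two dirty tags, the real markup between them goes as well. Here is a minimal sketch of a stricter pattern, assuming every dirty tag has a matching closing tag with the same name. The helper name `stripEscapedTags` and the three-paragraph input are made up for illustration and are not part of the question.

```javascript
// Hypothetical helper, not from the original answer.
// Assumes each dirty tag is escaped as &lt;name ...&gt; ... &lt;/name&gt;.
function stripEscapedTags(html) {
  // \1 refers back to the tag name captured by (\w+), so the lazy match
  // stops at the matching closing tag; the s flag lets . cross newlines.
  return html.replace(/&lt;(\w+)\b.*?&gt;.*?&lt;\/\1&gt;/gs, "");
}

// Made-up input with two dirty tags and a clean paragraph between them.
const two =
  "<p>&lt;img src=\"v\"&gt;one&lt;/img&gt;</p>" +
  "<p>keep</p>" +
  "<p>&lt;a&gt;two&lt;/a&gt;</p>";

// Greedy pattern: also swallows the clean paragraph in the middle.
console.log(two.replace(/&lt;.*&gt;/g, ""));

// Paired pattern: only the dirty tags and their content are removed.
console.log(stripEscapedTags(two));
```

This still treats HTML as a flat string, so it will not cope with nested dirty tags of the same name.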
Upvotes: 0
Reputation: 350300
A safe way is to use a DOM parser and visit each text node, so that each text can be cleaned separately. This way you can be certain that the DOM structure is not altered, only the texts:
let str = "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";
let doc = new DOMParser().parseFromString(str, "text/html");
// Walk over the text nodes only (NodeFilter.SHOW_TEXT === 4)
let walk = doc.createTreeWalker(doc.body, NodeFilter.SHOW_TEXT);
let node = walk.nextNode();
while (node) {
    // nodeValue holds the decoded text, so the dirty tag appears here as <...>
    node.nodeValue = node.nodeValue.replace(/<.*>/gs, "");
    node = walk.nextNode();
}
let clean = doc.body.innerHTML;
console.log(clean);
This will also work when you have more than one <p> element that has such content.
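To try the same idea outside a browser (for example in Node, where DOMParser is not available by default), here is a rough approximation that splits the string on the real tags instead of parsing it. This is not the DOM approach, only a sketch: `cleanTextSegments` is a made-up name, the input is invented, and it assumes that attribute values in the real markup never contain ">". Since nothing is decoded here, the pattern looks for &lt; and &gt; rather than < and >.

```javascript
// Hypothetical helper: approximates "clean each text node separately"
// with string splitting. Assumes no ">" inside attribute values.
function cleanTextSegments(html) {
  return html
    .split(/(<[^>]*>)/) // the capturing group keeps the real tags, at odd indexes
    .map((part, i) =>
      i % 2 === 1
        ? part                              // real tag: leave untouched
        : part.replace(/&lt;.*&gt;/gs, "")  // text: drop the escaped dirty tag
    )
    .join("");
}

// Made-up input: two paragraphs with dirty content, one clean paragraph.
const html =
  "<p>&lt;img src=\"v\"&gt;first&lt;/img&gt;</p>" +
  "<p><br></p>" +
  "<p>&lt;div&gt;second&lt;/div&gt;</p>";

console.log(cleanTextSegments(html));
```

Because each text segment is cleaned on its own, the greedy match cannot run from one paragraph into the next.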
Upvotes: 2