user1234

Reputation: 3159

How to remove content between &lt; and &gt; in JavaScript

I have content that contains a string of HTML elements along with images, for example:

var str = "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";

The text between &lt; and &gt; is a dirty tag, and I would like to remove it along with the content inside it. The tag is generated dynamically, so it could be any tag, e.g. <div>, <a>, <h1>, etc.

Expected output: <p></p><p><br></p><p><br></p>

However, with this code I'm only able to remove the tags, not the content inside them.

str.replaceAll(/&lt;.*?&gt;/g, "");

It renders like this, which is not what I'm looking for:

<p>fwefwefw</p><p><br></p><p><br></p>

How can I remove the &lt;...&gt; tags along with their content, so that I get rid of both the dirty tags and the text inside them?

fiddle: https://jsfiddle.net/3rozjn8m/

Thanks

Upvotes: 0

Views: 1221

Answers (2)

Kyle Noll

Reputation: 84

Remove the question mark, so that .* becomes greedy and matches everything from the first &lt; to the last &gt;:

var str= "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";
console.log(str.replaceAll(/&lt;.*&gt;/g, ""));
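To see what dropping the `?` changes, a small comparison may help. This is only a sketch: `one` is the string from the question, while `two` is a made-up input with two dirty tags, added to show where the greedy pattern can remove more than intended.

```javascript
// Sample from the question: one dirty tag inside the first <p>.
const one = "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";

// Hypothetical input (not from the question): dirty tags in two separate <p> elements.
const two = "<p>&lt;b&gt;x&lt;/b&gt;</p><p>keep</p><p>&lt;i&gt;y&lt;/i&gt;</p>";

const lazy = /&lt;.*?&gt;/g;   // stops at the nearest &gt;
const greedy = /&lt;.*&gt;/g;  // runs to the last &gt; on the same line

const lazyOne = one.replace(lazy, "");
const greedyOne = one.replace(greedy, "");
const greedyTwo = two.replace(greedy, "");

console.log(lazyOne);   // <p>fwefwefw</p><p><br></p><p><br></p>
console.log(greedyOne); // <p></p><p><br></p><p><br></p>
console.log(greedyTwo); // <p></p>
```

With a single dirty tag the greedy pattern gives the expected output. With two, it also swallows the clean `<p>keep</p>` between them, so it is only safe when at most one dirty tag can occur per string.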

Upvotes: 0

trincot

Reputation: 350300

A safe way is to use a DOM parser and visit each text node, so that each text can be cleaned separately. This way you can be certain that the DOM structure is not altered, only the text:

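let str = "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";

// Parse the string; entities such as &lt; are decoded inside the text nodes
let doc = new DOMParser().parseFromString(str, "text/html");
// Visit text nodes only (NodeFilter.SHOW_TEXT === 4)
let walk = doc.createTreeWalker(doc.body, NodeFilter.SHOW_TEXT);
let node = walk.nextNode();
while (node) {
    // Remove everything from the first "<" to the last ">" in this text node
    node.nodeValue = node.nodeValue.replace(/<.*>/gs, "");
    node = walk.nextNode();
}
// Serialize the cleaned body back to an HTML string
let clean = doc.body.innerHTML;

console.log(clean);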

This will also work when more than one <p> element has such content.
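Outside a browser, where `DOMParser` is not available, the idea of cleaning each text run separately can be roughly approximated by splitting the string on real tags. The helper name `stripDirtyTags` and the second test string are hypothetical, and the split assumes no attribute value contains a literal `>`. It is a stand-in for illustration; a real DOM parser remains the more robust choice.

```javascript
// Hypothetical helper (not from the answer): approximates "clean each text
// node separately" without a DOM by splitting the string on real tags.
function stripDirtyTags(html) {
  return html
    .split(/(<[^>]*>)/) // the capture group keeps the real tags in the result
    .map((part) =>
      part.startsWith("<")
        ? part // a real tag: leave untouched
        : part.replace(/&lt;.*&gt;/gs, "") // a text run: drop the encoded tag and its content
    )
    .join("");
}

// String from the question.
const single = stripDirtyTags("<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>");
// Made-up input with dirty tags in two separate <p> elements.
const multiple = stripDirtyTags("<p>&lt;b&gt;x&lt;/b&gt;</p><p>keep</p><p>&lt;i&gt;y&lt;/i&gt;</p>");

console.log(single);   // <p></p><p><br></p><p><br></p>
console.log(multiple); // <p></p><p>keep</p><p></p>
```

Because the greedy pattern is applied to one text run at a time, it cannot reach across a real tag, which is why the clean `<p>keep</p>` survives.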

Upvotes: 2
