Maarten Heideman

Reputation: 595

Javascript replace regex all html tags except p,a and img

I'm trying to remove all html tags except p, a and img tags. Right now I have:

content.replace(/(<([^>]+)>)/ig,""); 

But this removes all HTML tags.

Here is an example of the content returned by the API:

    <table id="content_LETTER.BLOCK9" border="0" width="100%" cellspacing="0" cellpadding="0" bgcolor="#F7EBF5">
    <tbody><tr><td class="ArticlePadding" colspan="1" rowspan="1" align="left" valign="top"><div>what is the opposite of...[] rest of text

Upvotes: 8

Views: 9961

Answers (3)

LanreSmith

Reputation: 161

var input = 'b<p on>b <p>good p</p> a<a>a h1<h1>h1 p<pre>p p</p onl>p img<img src/>img';
var output = input.replace(/(<(?!\/?((a|img)(\s+[^>]+)*|p)\s*>)([^>]+)>)/ig, '');
console.log(output);
output: bb <p>good p</p> a<a>a h1h1 pp pp img<img src/>img

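And if you'd also like to drop a and img tags that carry JS event handler attributes (note that this removes the whole tag, and it rejects any tag whose attribute text contains "on" anywhere, e.g. src="icon.png"):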

var input = 'b<p on>b <p>good p</p> a<a>a h1<h1>h1 p<pre>p p</p onl>p img<img src="y.gif" /> see <img src="x.png" onerror alt="cat" /> there';
var output = input.replace(/(<(?!\/?((a|img)(\s+((?!on)[^>])+)*|p)\s*>)([^>]+)>)/ig, '');
console.log(output);
output: bb <p>good p</p> a<a>a h1h1 pp pp img<img src="y.gif" /> see  there

Upvotes: 0

Dmitry Egorov

Reputation: 9650

You can match the tags you want to keep in a capture group and, using alternation, match every other tag. Then replace with $1, which puts the kept tags back and is empty for all the others:

(<\/?(?:a|p|img)[^>]*>)|<[^>]+>

Demo: https://regex101.com/r/Sm4Azv/2

And the JavaScript demo:

var input = 'b<body>b a<a>a h1<h1>h1 p<p>p p</p>p img<img />img';
var output = input.replace(/(<\/?(?:a|p|img)[^>]*>)|<[^>]+>/ig, '$1');
console.log(output);
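
One thing to watch with the pattern above: (?:a|p|img) is not anchored at the end of the tag name, so tags whose names merely start with those letters (pre, article and so on) are kept as well. Below is a minimal sketch of a stricter variant, assuming tag names are plain words. The helper name stripTagsExcept and the sample string are hypothetical, not part of the original answer.

```javascript
// Hypothetical helper: strip every tag except those listed in `allowed`.
function stripTagsExcept(html, allowed) {
  // Assumption: tag names contain no regex metacharacters.
  const names = allowed.join('|');
  // Group 1 captures an allowed tag. The lookahead requires the tag name to
  // end at whitespace, "/" or ">", so "<pre>" is not mistaken for "<p>".
  const pattern = new RegExp('(<\\/?(?:' + names + ')(?=[\\s/>])[^>]*>)|<[^>]+>', 'gi');
  // Allowed tags are restored via $1; for every other tag $1 is empty.
  return html.replace(pattern, '$1');
}

const sample = 'x<pre>y</pre> <p class="k">z</p> <article>w</article> <a href="#">l</a> <img src="i.png"/>';
console.log(stripTagsExcept(sample, ['a', 'p', 'img']));
// prints: xy <p class="k">z</p> w <a href="#">l</a> <img src="i.png"/>
```

Like every regex here, this treats any ">" as the end of a tag, so it is only suitable for reasonably well-formed markup.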

Upvotes: 14

degant

Reputation: 4981

You can use the regex below to remove all HTML tags except a, p and img:

<\/?(?!a)(?!p)(?!img)\w*\b[^>]*>

Replace with an empty string.

var text = '<tr><p><img src="url" /> some text <img another></img><div><a>blablabla</a></div></p></tr>';
var output = text.replace(/<\/?(?!a)(?!p)(?!img)\w*\b[^>]*>/ig, '');
console.log(output);
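
Note that (?!a) and (?!p) reject any tag name that starts with those letters, so tags such as article or pre are left in place too. The sketch below compares the pattern above with an assumed stricter variant that checks the optional slash and the complete tag name inside a single lookahead. The variable names and the sample markup are made up for illustration.

```javascript
// Pattern from the answer above, unchanged.
const original = /<\/?(?!a)(?!p)(?!img)\w*\b[^>]*>/gi;
// Assumed variant: a tag is protected only when its name is exactly
// a, p or img, i.e. the name is followed by whitespace, "/" or ">".
const strict = /<(?!\/?(?:a|p|img)(?=[\s\/>]))[^>]+>/gi;

const html = '<article><pre>code</pre><p>text <a href="#">link</a></p></article>';

const withOriginal = html.replace(original, '');
const withStrict = html.replace(strict, '');

console.log(withOriginal); // article and pre survive, so the string is unchanged
console.log(withStrict);   // prints: code<p>text <a href="#">link</a></p>
```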


Upvotes: 8
