Mr. Jo

Reputation: 5271

How can I exclude a class inside a regex string?

I'm trying to build a regex that removes all HTML tags from a string, except for one special element. The problem is that I've found no way to also exclude the closing tag of that special element. This is my code:

let str = 'You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span> <div class="keep-this">$500</div> also';

console.log(str.replace(/(?!<div class="keep-this">)(<\/?[^>]+(>|$))/g, ""));
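To make the problem concrete, here is a small sketch that runs the same replacement and shows what comes back. The names `pattern` and `result` are only for illustration. The negative lookahead is checked at the position of each `<`, so it protects the opening `<div class="keep-this">` but never the matching `</div>`:

```javascript
// Same input string as above.
const str = 'You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span> <div class="keep-this">$500</div> also';

// The lookahead only rejects a match that starts exactly at
// <div class="keep-this">; every other tag, including </div>, still matches.
const pattern = /(?!<div class="keep-this">)(<\/?[^>]+(>|$))/g;

const result = str.replace(pattern, "");
console.log(result);
// → You have to pay <div class="keep-this">$200 per month for your car <div class="keep-this">$500 also
```

The opening tags survive, but both closing `</div>` tags are stripped along with the `<span>` tags.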

How can I fix this?

Upvotes: 1

Views: 757

Answers (2)

Tim Biegeleisen

Reputation: 521914

Try this option, which matches every HTML element whose opening tag does not carry the attribute class="keep-this" and replaces it with just its content.

let str = 'You have to pay <input class="some-class"/> blah <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span> <div class="keep-this">$500</div> also';

console.log(str.replace(/<\s*([^\s>]+)(?:(?!\bclass="keep-this")[^>])*>(.*?)(?:<\/\1>)|<\s*([^\s>]+)(?:(?!\bclass="keep-this")[^>])*\/>/g, "$2"));

Here is an explanation of the regex pattern:

<                                 match < of an opening tag
\s*                               optional whitespace
([^\s>]+)                         match and capture the HTML tag name in $1 (\1)
(?:(?!\bclass="keep-this")[^>])*  match remainder of tag,
                                  so long as class="keep-this" is not seen
>                                 match > of an opening tag
(.*?)                             match and capture the tag's content in $2,
                                  until hitting the nearest
(?:<\/\1>)                        closing tag, which matches the opening one
|                                 OR
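<\s*([^\s>]+)                     match a standalone (self-closing) tag, e.g. <input/>,
                                  and capture its name in $3
(?:(?!\bclass="keep-this")[^>])*  match remainder of tag,
                                  so long as class="keep-this" is not seen
\/>                               match /> at the end of the self-closing tag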

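Then, we replace each match with $2, the captured content. This strips the surrounding tags but keeps the text between them. For a self-closing tag, $2 is not set, so the match is replaced with an empty string and the tag is removed entirely.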
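As a rough sketch of how the capture groups behave on the sample string, the snippet below lists each match with `matchAll` before doing the replacement. The property names `pairedTag`, `content`, and `selfClosingTag` are made up here for readability; they simply label groups 1, 2, and 3 of the pattern above.

```javascript
const str = 'You have to pay <input class="some-class"/> blah <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span> <div class="keep-this">$500</div> also';

const pattern = /<\s*([^\s>]+)(?:(?!\bclass="keep-this")[^>])*>(.*?)(?:<\/\1>)|<\s*([^\s>]+)(?:(?!\bclass="keep-this")[^>])*\/>/g;

// Collect every match together with its capture groups.
// Groups that did not take part in a match are undefined.
const matches = [...str.matchAll(pattern)].map(m => ({
  matched: m[0],
  pairedTag: m[1],      // tag name from the first alternative
  content: m[2],        // text between the opening and closing tag
  selfClosingTag: m[3], // tag name from the second alternative
}));

for (const m of matches) {
  console.log(m);
}

// "$2" keeps the content of paired tags and is empty for self-closing ones.
const result = str.replace(pattern, "$2");
console.log(result);
```

On this input only the `<input/>` and the two `<span>` elements are matched; both `keep-this` elements, including their closing `</div>` tags, are left untouched.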

Upvotes: 3

The fourth bird

Reputation: 163447

If you want to remove all HTML elements that do not have the class keep-this, you could also use DOMParser and select those elements with the :not() pseudo-class, for example:

let str = 'You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span> <div class="keep-this">$500</div> also';
let parser = new DOMParser();
let doc = parser.parseFromString(str, "text/html");
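// Unwrap each non-kept element by replacing it with its own child nodes.
// (Passing e.innerHTML instead would insert nested markup as escaped text.)
doc.querySelectorAll("body *:not(.keep-this)").forEach(e => e.replaceWith(...e.childNodes));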
console.log(doc.body.innerHTML);

Upvotes: 3
