user3815508

Reputation: 399

Deleting HTML blocks with a regular expression

I am trying to delete all HTML blocks that are closed. For example, the following block should be deleted, since it has both an opening and a closing tag (<> ... </>):

<b> some text </b>

But if a block is not closed (it lacks the </> part), it should not be deleted. Below is the snippet of HTML code to process:

<div id="MyDiv">div,
  <strong>
    <span>span2, </span>                    <-- to be deleted
        <em> Some text for em
        <div> Some text for div </div>      <-- to be deleted
          <p><b>b,  <span id="MySpan"> Some text for span ... 

After processing, it should look something like this:

<div id="MyDiv">div,
  <strong>
        <em> Some text for em
          <p><b>b,  <span id="MySpan">span1,

I need a regular expression to accomplish this, e.g. something like the following:

var sHTML = $('#MyDiv').html();
sHTML = sHTML.replace(/^<.*>.*?<\/.*>/ig, '');

Thanks in advance.

Upvotes: 0

Views: 174

Answers (3)

celeritas

Reputation: 2281

Assuming your HTML is in a file called test.html, here's a Perl one-liner:

perl -pi -e 's/<(\w+)[^>]*>[^<>]*<\/\1\s*>//g' test.html

Upvotes: 0

vks

Reputation: 67968

<([^>]*)>[^><]*<\/\s*\1\s*>|<(\w+)\s+[^>]*>[^><]*<\/\s*\2\s*>

Try this. Replace each match with an empty string.

See demo.

http://regex101.com/r/hQ1rP0/79
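
In JavaScript, the pattern above could be applied roughly as follows. This is only a sketch under a few assumptions: the helper name removeClosedBlocks and the repeat-until-stable loop are illustrative additions rather than part of the original answer, and the sample string is the question's snippet with the "<--" annotations left out.

```javascript
// Hypothetical helper (an assumption, not part of the original answer):
// removes every element whose opening and closing tags enclose only text,
// and repeats until nothing more can be removed.
function removeClosedBlocks(html) {
  // The pattern from the answer above, with the g flag so that every
  // match in the string is replaced, not just the first one.
  const closedBlock =
    /<([^>]*)>[^><]*<\/\s*\1\s*>|<(\w+)\s+[^>]*>[^><]*<\/\s*\2\s*>/g;

  let previous;
  do {
    previous = html;
    html = html.replace(closedBlock, '');
  } while (html !== previous); // a removal can expose a new closed block

  return html;
}

// The question's snippet, without the "<--" annotations.
const sample = [
  '<div id="MyDiv">div,',
  '  <strong>',
  '    <span>span2, </span>',
  '        <em> Some text for em',
  '        <div> Some text for div </div>',
  '          <p><b>b,  <span id="MySpan"> Some text for span ...',
].join('\n');

console.log(removeClosedBlocks(sample));
```

The loop is there because a single pass only removes blocks that contain no other tags; once an inner block is gone, its parent may become a closed block too. Note that the whitespace around a removed block is left in place, so the lines that held the deleted blocks remain as blank, indented lines.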

Upvotes: 1

Vajura

Reputation: 1132

Never mind, this should work for every case, or at least I am fairly sure it does:

(<[^>]*>[^<]*<[^>]*>)

Upvotes: 0
