holographix

Reputation: 2557

match all tags after a specific tag

I'm beating this dead horse here:

<p style='margin: 5px 0;'>I wan't be matched!</p>
<p style='margin: 5px 0;'>me 2!</p>
<ul>
    <li>
      <b>Lorem</b>
      ipsum sit dolor amet
    </li>

    <li>
      <b>Lorem</b>
      ipsum sit dolor amet
    </li>

    <li>
      <b>Lorem</b>
      ipsum sit dolor amet
    </li>

    <li>
      <b>Lorem</b>
      ipsum sit dolor amet
    </li>

    <p style='margin: 5px 0;'>can i haz regex</p>
    <p style='margin: 5px 0;'>NO! you can't</p>
    <li>
      <b>Lorem</b>
      ipsum sit dolor amet
    </li>   
</ul>

From that, I need a regex that changes every

    <p style='margin: 5px 0;'>can i haz regex</p>

after the <ul> tag and converts it into

    <li>can i haz regex</li>

Simple as that, but since I'm a real noob at regex, I can't get it done.

I tried a lookbehind expression, but with no success:

  (?m:(?<=(.*?<ul>.*?)(<p style='margin: 5px 0;'>.*?</p>)+)

I've been trying to figure this out for about 2 hours, but I can't seem to make it work. Thanks in advance to anyone who can explain how this should be set up. :)

Upvotes: 0

Views: 76

Answers (1)

Lev Levitsky

Reputation: 65871

If the lines to change must be between <ul> and </ul>, then you could try something like the following sed command:

sed "/<ul>/,/<\/ul>/ s|<p style='margin: 5px 0;'>\(.*\)</p>|<li>\1</li>|g" test.html

This isn't regex-only, in the sense that I also specify an address range. You really need to be careful using these tools with HTML, though; I agree with the comments. To begin with, you don't want to depend on whitespace or on how the tags sit on lines.

Also, maybe you could tell us what language you are using (if it matters).
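
In case sed isn't an option, here is a rough sketch of the same idea in Python. Python is purely an assumption on my part, since the question doesn't name a language, and `convert_p_inside_ul` is a made-up helper name. It only approximates sed's `/<ul>/,/<\/ul>/` range by tracking whether the current line falls between `<ul>` and `</ul>`, and it has the same weakness: it depends on how the tags sit on lines.

```python
import re

# Matches the exact <p> markup from the question; the capture group is the text.
P_TAG = re.compile(r"<p style='margin: 5px 0;'>(.*?)</p>")


def convert_p_inside_ul(html: str) -> str:
    """Rewrite <p ...>text</p> as <li>text</li>, but only between <ul> and </ul>.

    Hypothetical helper: a line-based approximation of a sed address range,
    not a real HTML parser.
    """
    out = []
    inside_ul = False
    for line in html.splitlines():
        if "<ul>" in line:
            inside_ul = True          # range starts on the line containing <ul>
        if inside_ul:
            line = P_TAG.sub(r"<li>\1</li>", line)
        if "</ul>" in line:
            inside_ul = False         # range ends on the line containing </ul>
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "\n".join([
        "<p style='margin: 5px 0;'>me 2!</p>",
        "<ul>",
        "    <p style='margin: 5px 0;'>can i haz regex</p>",
        "</ul>",
    ])
    print(convert_p_inside_ul(sample))
```

Paragraphs before the `<ul>` line are left alone and only those inside the list are rewritten. For anything beyond a one-off cleanup, a proper HTML parser would be the safer choice.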

Upvotes: 1
