raphink

Reputation: 3665

Remove specific lines inside a range with sed

I'm trying to remove a block inside a pair of matching patterns using sed. Given a block like:

                            <span 
class="fxlbc-t1-x-x-172">M<span 
class="small-caps">A</span><span 
class="small-caps">R</span><span 
class="small-caps">S</span></span>
                               <span 
class="fxlbc-t1-x-x-248">R<span 
class="small-caps">A</span><span 
class="small-caps">I</span><span 
class="small-caps">S</span><span 
class="small-caps">O</span><span 
class="small-caps">N</span></span>

I need to remove the block:

                            <span 
class="fxlbc-t1-x-x-172">M<span 
class="small-caps">A</span><span 
class="small-caps">R</span><span 
class="small-caps">S</span></span>

I'm trying to do that in sed. The first problem I hit when using the N command is that it joins lines in pairs, so whether a match is found depends on whether the block starts on an odd or an even line. I've worked around it like this:

sed -i 'N
 /.*<span \nclass="fxlbc-t1-x-x-172".*/,/.*class="fxlbc-t1-x-x-248".*/ {
   /.*fxlbc-t1-x-x-172.*/d
   }' test.html

# Add an empty line
sed -i '1i\ ' test.html

sed -i 'N
 /.*<span \nclass="fxlbc-t1-x-x-172".*/,/.*class="fxlbc-t1-x-x-248".*/ {
   /.*fxlbc-t1-x-x-172.*/d,
   /.*
   }' test.html

I'm pretty sure there must be an easier way of doing it, and then I'm stuck with how to properly remove the other lines of the block (without removing the fxlbc-t1-x-x-248 span line). Any idea?

Upvotes: 0

Views: 234

Answers (1)

raphink

Reputation: 3665

I was given the answer to my problem by a colleague:

sed -i ':a ; $! { N ; ba } ; $s/\(<span\( \|\n\|\t\)\+class="fxlbc-t1-x-x-172">[^4]\+\)\(<span\( \|\n\|\t\)\+class="fxlbc-t1-x-x-248">\)/\3/g' test.html

It reads the whole file into sed's pattern space, and then does a standard search and replace on that single string, keeping only the third group (the opening of the fxlbc-t1-x-x-248 span). I reckon it's very ugly though, but it does the trick.
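For anyone who wants to try the idea without touching a real file, here is a minimal sketch of the same "slurp, then substitute" technique. It assumes GNU sed (for -E and for `.` matching newlines in the pattern space). The sample file is a shortened, hypothetical version of the block from the question, and the regex is a simplified variant of the one above, not the exact command: it uses a greedy `.*` instead of `[^4]\+`, so it only behaves when the fxlbc-t1-x-x-248 span occurs once in the file.

```shell
#!/bin/sh
# Hypothetical sample input, shortened from the block in the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
   <span
class="fxlbc-t1-x-x-172">M<span
class="small-caps">A</span><span
class="small-caps">S</span></span>
   <span
class="fxlbc-t1-x-x-248">R<span
class="small-caps">N</span></span>
EOF

# ':a', 'N', '$!ba' loop until the last line, appending each line to the
# pattern space, so the substitution sees the whole file as one string.
# The s command then drops everything from the opening of the 172 span up
# to the opening of the 248 span, and puts the 248 opening (group 1) back.
result=$(sed -E -e ':a' -e 'N' -e '$!ba' \
  -e 's/<span[[:space:]]+class="fxlbc-t1-x-x-172">.*(<span[[:space:]]+class="fxlbc-t1-x-x-248">)/\1/' \
  "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

The sketch prints the result instead of using -i, so nothing is modified in place. Once the output looks right, the same script can be run with -i on the real file.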

Upvotes: 1
