Reputation: 3665
I'm trying to remove a block inside a pair of matching patterns using sed. Given a block like:
<span
class="fxlbc-t1-x-x-172">M<span
class="small-caps">A</span><span
class="small-caps">R</span><span
class="small-caps">S</span></span>
<span
class="fxlbc-t1-x-x-248">R<span
class="small-caps">A</span><span
class="small-caps">I</span><span
class="small-caps">S</span><span
class="small-caps">O</span><span
class="small-caps">N</span></span>
I need to remove the block:
<span
class="fxlbc-t1-x-x-172">M<span
class="small-caps">A</span><span
class="small-caps">R</span><span
class="small-caps">S</span></span>
I'm trying to do that in sed. The first problem I hit with the N command is odd versus even lines, since N joins lines in pairs.
I worked around it by running the command twice, inserting an empty line in between to shift the pairing:
sed -i 'N
/.*<span \nclass="fxlbc-t1-x-x-172".*/,/.*class="fxlbc-t1-x-x-248".*/ {
/.*fxlbc-t1-x-x-172.*/d
}' test.html
# Add an empty line
sed -i '1i\ ' test.html
sed -i 'N
/.*<span \nclass="fxlbc-t1-x-x-172".*/,/.*class="fxlbc-t1-x-x-248".*/ {
/.*fxlbc-t1-x-x-172.*/d,
/.*
}' test.html
I'm pretty sure there must be an easier way of doing it, and I'm still stuck on how to remove the other lines of the block without also removing the fxlbc-t1-x-x-248 span line. Any ideas?
Upvotes: 0
Views: 234
Reputation: 3665
I was given the answer to my problem by a colleague:
sed -i ':a ; $! { N ; ba } ; $s/\(<span\( \|\n\|\t\)\+class="fxlbc-t1-x-x-172">[^4]\+\)\(<span\( \|\n\|\t\)\+class="fxlbc-t1-x-x-248">\)/\3/g' test.html
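It reads the whole file into sed's pattern space, then does a standard search and replace on the buffered string. The [^4]\+ part keeps the match from running past the 248 span, so it only works as long as the block being removed contains no 4. It's very ugly, but it does the trick.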
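For comparison, a line-oriented version might look like the sketch below. It is not the colleague's method, and it rests on assumptions taken from the sample input only: the block opens with a line containing just `<span`, the next line starts with `class="fxlbc-t1-x-x-172"`, and the block ends on the first line that ends in `</span></span>`. The function name `strip_172_block` is made up for illustration.

```shell
# Hypothetical helper: print FILE without the fxlbc-t1-x-x-172 span block.
# Assumes the block starts on a line that is exactly "<span" and ends on
# the first line ending in "</span></span>" (as in the sample above).
strip_172_block() {
  sed '
# A bare "<span" line may open the block: pull in the next line to check.
/^<span$/ {
  N
  # Only act when the joined pair carries the unwanted class.
  /\nclass="fxlbc-t1-x-x-172"/ {
    :a
    # Keep appending lines until the pattern space ends with the closing tags.
    /<\/span><\/span>$/!{
      N
      ba
    }
    # Drop the whole collected block and start the next cycle.
    d
  }
}' "$1"
}

# Usage: strip_172_block test.html > cleaned.html
```

Because it only joins the lines it needs, it does not depend on whether the block starts on an odd or an even line, and it does not care which characters appear inside the block. Add `-i` to the `sed` call to edit the file in place, as in the commands above.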
Upvotes: 1