Reputation: 361
I have tried a number of ways to approach this but I'm out of ideas. Hopefully someone out there can point out what I am doing wrong.
Here is my input:
<Root>
<A>Keep</A>
<B>Keep</B>
<B>Remove</B>
<B>Keep</B>
<C>Keep</C>
</Root>
As you can probably tell, I'm just trying to remove line 4, <B>Remove</B>, so that the output looks like this:
<Root>
<A>Keep</A>
<B>Keep</B>
<B>Keep</B>
<C>Keep</C>
</Root>
Here is what I have so far, but it's not quite working as intended:
sed -e '3,${g;s/<B>.*<\/B>//p}' t1
I tried adding some of the grouping logic I found elsewhere, but it's not working; it seems sed has no direct way of making the match non-greedy.
Any ideas?
Upvotes: 1
Views: 38
Reputation: 92894
Hopefully someone out there can point out what I am doing wrong
The right way is to use an XML/HTML parser such as xmlstarlet or xmllint:
xmlstarlet ed -O -d "//Root/*[3]" input.xml
ed - edit mode
-O - omit XML declaration (<?xml ...?>)
-d - delete action
"//Root/*[3]" - XPath expression selecting the 3rd child node of the parent node Root
The output:
<Root>
<A>Keep</A>
<B>Keep</B>
<B>Keep</B>
<C>Keep</C>
</Root>
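The positional XPath above depends on the unwanted element always being the 3rd child. As a rough sketch, assuming the element can instead be identified by its text "Remove", the same delete can be written by content. The temporary file and the variable names (input, by_xpath, by_sed) are made up for illustration. The sed line is only a fallback and assumes every element sits on its own line, as in the sample.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
<Root>
<A>Keep</A>
<B>Keep</B>
<B>Remove</B>
<B>Keep</B>
<C>Keep</C>
</Root>
EOF

# Parser-based: delete every B child of Root whose text is exactly "Remove".
# Guarded because xmlstarlet may not be installed (some systems name it "xml").
if command -v xmlstarlet >/dev/null 2>&1; then
    by_xpath=$(xmlstarlet ed -O -d '//Root/B[text()="Remove"]' "$input")
    printf '%s\n' "$by_xpath"
fi

# Line-based fallback: delete lines consisting only of <B>Remove</B>
# (optionally surrounded by whitespace). Breaks if the XML is reformatted.
by_sed=$(sed '/^[[:space:]]*<B>Remove<\/B>[[:space:]]*$/d' "$input")
printf '%s\n' "$by_sed"

rm -f "$input"
```

As for the original attempt: the g command replaces each line with the contents of the hold space, which is empty here, so from line 3 onward there is nothing left for the s command to match. Greediness is not the issue.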
Upvotes: 3