Reputation: 4767
I am looking for a solution to the following problem and suspect awk could handle it more simply than my clumsy shell script.
I have an XML file consisting of multiple sections, as shown below. I also have a list of values.
For each section <top_tag> ... </top_tag> where value_x is in my list, delete (i.e., do not print) the whole <top_tag> ... </top_tag> section.
<xml>
<outer_tag>
<top_tag>
<tag>value_1</tag>
<other_tags></other_tags>
</top_tag>
<top_tag>
<tag>value_2</tag>
<other_tags></other_tags>
</top_tag>
...
<top_tag>
<tag>value_n</tag>
<other_tags></other_tags>
</top_tag>
</outer_tag>
Your suggestions are most appreciated.
Upvotes: 1
Views: 2784
Reputation: 58371
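This might work for you (GNU sed syntax). It buffers each <top_tag> ... </top_tag> block in the hold space and deletes the block if it contains a matching <tag> line. As written, value.* matches any value beginning with "value", so replace it with the value(s) you want to drop: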
sed -i '/<top_tag>/,/<\/top_tag>/!b;/<top_tag>/{h;d};H;/<\/top_tag/!d;x;/<tag>value.*<\/tag>/d' file
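The same block-buffering idea can be sketched in Python so that the unwanted values come from a list instead of being hard-coded in the regex. This is only a minimal line-based sketch, not a real XML parser: it assumes each tag sits on its own line as in the question's sample, and the function name `filter_sections` and the sample data are hypothetical.

```python
import re

def filter_sections(lines, unwanted):
    """Yield lines, dropping every <top_tag>...</top_tag> block whose
    <tag> value is in `unwanted`. Assumes one tag per line."""
    tag_value = re.compile(r"<tag>(.*?)</tag>")
    block = None  # None while we are outside a <top_tag> block
    for line in lines:
        if block is None:
            if "<top_tag>" in line:
                block = [line]   # start buffering (like sed's `h`)
            else:
                yield line       # outside a block: pass through unchanged
            continue
        block.append(line)       # keep buffering (like sed's `H`)
        if "</top_tag>" in line:
            values = tag_value.findall("".join(block))
            if not any(v in unwanted for v in values):
                yield from block  # keep the block only if no value is listed
            block = None
    if block is not None:
        yield from block          # unterminated block: emit as-is

# Hypothetical sample input modelled on the question
sample = """<outer_tag>
<top_tag>
<tag>value_1</tag>
<other_tags></other_tags>
</top_tag>
<top_tag>
<tag>value_2</tag>
<other_tags></other_tags>
</top_tag>
</outer_tag>
""".splitlines(keepends=True)

result = "".join(filter_sections(sample, {"value_2"}))
print(result)
```

Values are compared exactly, so value_1 in the list would not also remove a section containing value_10, which a loose regex might.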
Upvotes: 2
Reputation: 18543
What you need here is not awk but XSLT, which was created specifically for this kind of task. It lets you transform one XML document into another.
For an input much like yours:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="example.xsl"?>
<outer_tag>
<top_tag>
<tag>value_1</tag>
<other_tags></other_tags>
</top_tag>
<top_tag>
<tag>value_2</tag>
<other_tags></other_tags>
</top_tag>
<top_tag>
<tag>value_3</tag>
<other_tags></other_tags>
</top_tag>
<top_tag>
<tag>value_n</tag>
<other_tags></other_tags>
</top_tag>
</outer_tag>
The following XSLT removes all top_tag elements with value_3 by simply not copying them and ignoring their contents.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Default rule: recreate each element and recurse into its children
       (attributes are not copied by this template). -->
  <xsl:template match="*">
    <xsl:element name="{name()}">
      <xsl:apply-templates select="child::node()"/>
    </xsl:element>
  </xsl:template>
  <!-- Empty template: a top_tag whose tag child equals value_3 produces no output. -->
  <xsl:template match="top_tag[tag = 'value_3']"/>
</xsl:stylesheet>
Every major programming language has at least a couple of libraries that can process XML input with an XSLT stylesheet. Command-line tools and GUI applications (IDEs, among others) can do it as well. Finally, web browsers can transform files using XSLT if you reference the XSL file with a processing instruction like this:
<?xml-stylesheet type="text/xsl" href="example.xsl"?>
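If the values to remove come from a list rather than a single hard-coded value_3, the same "do not copy matching top_tag elements" idea can also be sketched with an XML parser directly. The example below uses Python's standard xml.etree.ElementTree instead of XSLT. The function name `drop_sections` and the sample document are hypothetical, and it assumes the value sits in a direct tag child of each top_tag, as in the input above.

```python
import xml.etree.ElementTree as ET

def drop_sections(xml_text, unwanted):
    """Return the document with every <top_tag> removed whose direct
    <tag> child has text listed in `unwanted`."""
    root = ET.fromstring(xml_text)
    # Iterate over list copies so removing children is safe mid-loop.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "top_tag" and child.findtext("tag") in unwanted:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

# Hypothetical sample input modelled on the document above
sample = """<outer_tag>
<top_tag><tag>value_1</tag><other_tags></other_tags></top_tag>
<top_tag><tag>value_2</tag><other_tags></other_tags></top_tag>
<top_tag><tag>value_3</tag><other_tags></other_tags></top_tag>
</outer_tag>"""

cleaned = drop_sections(sample, {"value_1", "value_3"})
print(cleaned)
```

Because the document is parsed rather than matched line by line, this does not depend on how the tags are laid out across lines, although the serializer may write empty elements in a different form than the input did.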
Upvotes: 2