Reputation: 1237
I have more than 500 XML files, all with a similar structure.
Each has a <stream> tag and its corresponding </stream> tag, with many lines of text in between. Is there a way to quickly remove everything between the two tags (possibly including the tags themselves) without having to manually select and delete all of that text in every file?
I use Notepad to open these files but can use other software if needed.
Upvotes: 0
Views: 377
Reputation: 167581
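Use XSLT: an identity template copies every node unchanged, and an empty template matching stream drops that element together with everything inside it, e.g.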
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="stream"/>
</xsl:stylesheet>
Various XSLT processors have a command-line interface, or you can use a few lines of PowerShell (xslt.xsl is the stylesheet above saved under that name), e.g.
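$xslt = New-Object System.Xml.Xsl.XslCompiledTransform
# .NET resolves relative paths against the process working directory, which can
# differ from the current PowerShell location, so build full paths from $PWD.
$xslt.Load((Join-Path $PWD "xslt.xsl"))
$xslt.Transform((Join-Path $PWD "input.xml"), (Join-Path $PWD "output.xml"))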
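The PowerShell snippet above transforms a single file, while the question is about several hundred. As one possible way to cover a whole folder, here is a minimal sketch in Python that swaps XSLT for the standard library's xml.etree.ElementTree. The function names strip_streams and strip_streams_in_dir are made up for illustration, and the sketch assumes the stream elements are in no namespace and are not the document root. Unlike the identity transform, ElementTree does not keep comments or a DOCTYPE, and text that directly follows a removed </stream> tag is dropped with it, so try it on copies first.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def strip_streams(root):
    """Remove every <stream> element below root, in place (hypothetical helper)."""
    # Snapshot the elements first so the tree is not mutated while iterating.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "stream":
                # Drops the element, its contents, and its trailing (tail) text.
                parent.remove(child)


def strip_streams_in_dir(src_dir, dst_dir):
    """Write a stream-free copy of each *.xml file in src_dir to dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for xml_file in sorted(src.glob("*.xml")):
        tree = ET.parse(str(xml_file))
        strip_streams(tree.getroot())
        # Originals stay untouched; output is re-serialised as UTF-8.
        tree.write(str(dst / xml_file.name), encoding="utf-8", xml_declaration=True)
```

Calling strip_streams_in_dir("xml_in", "xml_out") would process every .xml file directly inside xml_in; both folder names are placeholders.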
Upvotes: 2