Reputation: 1
I need to correct a few hundred XML files.
Let's say the files have this format:
<?xml version="1.0" encoding="UTF-8"?>
<MyData xmlns="urn:iso" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:iso">
<Hdr>
<AppHdr xmlns="urn:iso" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:iso">
<St>A</St>
<To>Z</To>
</AppHdr>
</Hdr>
<Data>
<Document xmlns="urn:iso" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:iso">
<CountryReport>
<RptHdr>
<RpDtls>
<Dt>2018-07-10</Dt>
</RpDtls>
</RptHdr>
<Country>
<Id>PT</Id>
<FullNm>Portugal</FullNm>
<Bd>
<Tp>US</Tp>
</Bd>
</Country>
<Country>
<Id>ESP</Id>
<FullNm>Spain</FullNm>
<Bd>
<Tp>EUR</Tp>
</Bd>
</Country>
</CountryReport>
</Document>
</Data>
</MyData>
The replacement I need to do is the following:
I've tried different approaches using sed, xmllint, and Python's ElementTree, but without success.
I may be using the wrong XPath, but unfortunately I cannot figure it out.
Can you help?
Upvotes: 0
Views: 96
Reputation: 29022
The easiest way to achieve your goal is to use an XSLT processor. For example, use a script that calls the Linux program xsltproc or the Windows/Linux program saxon.
Because your elements are in a namespace, you have to declare that namespace in the stylesheet and use its prefix in the match patterns. For example, add xmlns:ui="urn:iso" to your xsl:stylesheet element and then use the following template in combination with the identity template:
<xsl:template match="ui:Country[ui:Id='PT']/ui:Bd/ui:Tp">
<xsl:element name="Tp" namespace="{namespace-uri()}">EUR</xsl:element>
</xsl:template>
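The same namespace rule is what usually trips up ElementTree attempts: an unprefixed path such as Country/Bd/Tp matches nothing, because every element lives in urn:iso. Below is a minimal Python sketch of the equivalent lookup, assuming the goal is the same as in the template above (set Tp to EUR for the country whose Id is PT). The function name set_tp and the shortened sample string are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The prefix is arbitrary; only the URI has to match the document.
NS = {"ui": "urn:iso"}


def set_tp(root, country_id, new_value):
    """Set the text of Bd/Tp for every Country whose Id equals country_id.

    Returns the number of Tp elements that were changed.
    """
    changed = 0
    # Every step of the path needs the prefix, as in the XSLT match pattern.
    for country in root.iterfind(".//ui:Country", NS):
        if country.findtext("ui:Id", namespaces=NS) == country_id:
            for tp in country.iterfind("ui:Bd/ui:Tp", NS):
                tp.text = new_value
                changed += 1
    return changed


# Shortened stand-in for one of the real files (hypothetical sample).
SAMPLE = """<MyData xmlns="urn:iso">
  <Country><Id>PT</Id><Bd><Tp>US</Tp></Bd></Country>
  <Country><Id>ESP</Id><Bd><Tp>EUR</Tp></Bd></Country>
</MyData>"""

root = ET.fromstring(SAMPLE)
count = set_tp(root, "PT", "EUR")
```

When writing the tree back out, calling ET.register_namespace("", "urn:iso") beforehand should keep the default namespace instead of generated ns0: prefixes; it is worth checking the output of one file before running this over all of them.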
The identity template of XSLT-1.0 is:
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
With XSLT-3.0 you could use the following instruction instead:
<xsl:mode on-no-match="shallow-copy" />
So a complete XSLT-1.0 file to transform all of your XML files could look like:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:ui="urn:iso">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<!-- identity template -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="ui:Country[ui:Id='PT']/ui:Bd/ui:Tp">
<xsl:element name="Tp" namespace="{namespace-uri()}">EUR</xsl:element>
</xsl:template>
</xsl:stylesheet>
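An xsltproc bash command could look like this (matching only *.xml so that the stylesheet and earlier .NEW output are not processed, and quoting the file names in case they contain spaces):
for file in *.xml; do xsltproc transform.xslt "$file" > "$file.NEW"; done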
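If a shell loop is inconvenient, for example on Windows, the same batch step can be driven from Python. This is only a sketch: it assumes xsltproc is installed and on the PATH, that the stylesheet is saved as transform.xslt, and that the input files end in .xml. The helper names build_command and transform_all are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(stylesheet, source):
    """Return (argv, target) for transforming one file with xsltproc."""
    source = Path(source)
    # Mirror the shell loop: write next to the input with a .NEW suffix.
    target = source.with_name(source.name + ".NEW")
    # -o makes xsltproc write the result to a file instead of stdout.
    argv = ["xsltproc", "-o", str(target), str(stylesheet), str(source)]
    return argv, target


def transform_all(directory, stylesheet="transform.xslt"):
    """Run xsltproc on every *.xml file in directory; return the new paths."""
    written = []
    for source in sorted(Path(directory).glob("*.xml")):
        argv, target = build_command(stylesheet, source)
        # check=True raises CalledProcessError if xsltproc reports a failure.
        subprocess.run(argv, check=True)
        written.append(target)
    return written
```

Calling transform_all(".") would then process the current directory. Writing to separate .NEW files rather than overwriting the originals makes it easy to diff a few results before replacing anything.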
Upvotes: 3