Reputation: 21
I have two XML files that I want to merge, but I do not want to change any of the existing elements from the original file. What is the best way to do this on a Linux system?
Note: there are posts about using XSLT that seem to be close to what I need, but I do not have an XSLT processor installed (nor do I have the rights to install one). That said, I do have xsltproc installed, but I'm not sure whether it will help. If xsltproc would help, please provide a suitable command-line example.
Here is a snippet of the original file:
<?xml version="1.0" encoding="utf-8"?>
<config xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
<Comment>This file was automatically generated.</Comment>
<FieldAttrs>
<Name>FieldAttrsAll</Name>
<Field>
<Name>wLegExchInstIds</Name>
<Fid>6203</Fid>
<Type>StringVector</Type>
<CheckModified>true</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>false</ClearDaily>
</Field>
<Field>
<Name>wPartitionId</Name>
<Fid>5886</Fid>
<Type>Integer</Type>
<CheckModified>true</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>false</ClearDaily>
</Field>
</FieldAttrs>
</config>
And here is the new file I need to merge:
<?xml version="1.0" encoding="utf-8"?>
<config xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
<Comment>This file was automatically generated.</Comment>
<FieldAttrs>
<Name>FieldAttrsAll</Name>
<Field>
<Name>wLegExchInstIds</Name>
<Fid>6203</Fid>
<Type>StringVector</Type>
<CheckModified>false</CheckModified>
<PublishField>false</PublishField>
<ClearDaily>false</ClearDaily>
</Field>
<Field>
<Name>wPartitionId</Name>
<Fid>5886</Fid>
<Type>Integer</Type>
<CheckModified>false</CheckModified>
<PublishField>false</PublishField>
<ClearDaily>false</ClearDaily>
</Field>
<Field>
<Name>wUnverifiedPriceIndicator</Name>
<Fid>5885</Fid>
<Type>Bool</Type>
<CheckModified>true</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>true</ClearDaily>
</Field>
<Field>
<Name>wCorrIsIrregular</Name>
<Fid>5884</Fid>
<Type>Bool</Type>
<CheckModified>false</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>true</ClearDaily>
</Field>
</FieldAttrs>
</config>
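In particular, note two things: the two fields present in both files (wLegExchInstIds and wPartitionId) have different CheckModified and PublishField values in the new file, and the new file contains two fields (wUnverifiedPriceIndicator and wCorrIsIrregular) that the original file lacks.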
Given the above files, I want the output to look as follows:
<config xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
<Comment>This file was automatically generated.</Comment>
<FieldAttrs>
<Name>FieldAttrsAll</Name>
<Field>
<Name>wLegExchInstIds</Name>
<Fid>6203</Fid>
<Type>StringVector</Type>
<CheckModified>true</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>false</ClearDaily>
</Field>
<Field>
<Name>wPartitionId</Name>
<Fid>5886</Fid>
<Type>Integer</Type>
<CheckModified>true</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>false</ClearDaily>
</Field>
<Field>
<Name>wUnverifiedPriceIndicator</Name>
<Fid>5885</Fid>
<Type>Bool</Type>
<CheckModified>true</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>true</ClearDaily>
</Field>
<Field>
<Name>wCorrIsIrregular</Name>
<Fid>5884</Fid>
<Type>Bool</Type>
<CheckModified>false</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>true</ClearDaily>
</Field>
</FieldAttrs>
</config>
Upvotes: 0
Views: 736
Reputation: 107622
Consider the following XSLT, which uses the document() function to read a second, external XML file. The transform runs on the longer (new) XML file and uses the shorter (original) XML file to replace the duplicated fields, rather than starting from the shorter file and adding the distinct nodes:
XSLT (save as an .xsl file, e.g. transform.xsl; it references the shorter XML file as ShorterXML.xml, which must be saved in the same directory)
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<!-- Identity Transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Rebuild FieldAttrs: fields from the shorter file first, then the remaining fields of the longer file -->
<xsl:template match="FieldAttrs">
<xsl:copy>
<xsl:copy-of select="Name"/>
<xsl:copy-of select="document('ShorterXML.xml')/config/FieldAttrs/Field"/>
<!-- Name was already copied above, so process only the Field children here -->
<xsl:apply-templates select="Field"/>
</xsl:copy>
</xsl:template>
<!-- Drop fields of the longer file whose Name already exists in the shorter file -->
<xsl:template match="Field[Name=document('ShorterXML.xml')/config/FieldAttrs/Field/Name]"/>
</xsl:transform>
Linux command line (takes only the longer XML file as input; the stylesheet loads the shorter one itself, so keep all files in the same directory)
xsltproc -o output.xml transform.xsl LongerXML.xml
Output
<?xml version="1.0" encoding="UTF-8"?>
<config xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
<Comment>This file was automatically generated.</Comment>
<FieldAttrs>
<Name>FieldAttrsAll</Name>
<Field>
<Name>wLegExchInstIds</Name>
<Fid>6203</Fid>
<Type>StringVector</Type>
<CheckModified>true</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>false</ClearDaily>
</Field>
<Field>
<Name>wPartitionId</Name>
<Fid>5886</Fid>
<Type>Integer</Type>
<CheckModified>true</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>false</ClearDaily>
</Field>
<Field>
<Name>wUnverifiedPriceIndicator</Name>
<Fid>5885</Fid>
<Type>Bool</Type>
<CheckModified>true</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>true</ClearDaily>
</Field>
<Field>
<Name>wCorrIsIrregular</Name>
<Fid>5884</Fid>
<Type>Bool</Type>
<CheckModified>false</CheckModified>
<PublishField>true</PublishField>
<ClearDaily>true</ClearDaily>
</Field>
</FieldAttrs>
</config>
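If Python 3 happens to be available on the machine (an assumption, it is not mentioned in the question), the result can be sanity-checked against the two requirements: original fields keep their values, and fields that exist only in the new file are added. The sketch below uses only the standard library; the names fields_by_name and check_merge are made up for illustration.

```python
import xml.etree.ElementTree as ET


def fields_by_name(root):
    """Map each Field's <Name> text to a dict of its child tag -> text."""
    return {
        field.findtext("Name"): {child.tag: child.text for child in field}
        for field in root.iter("Field")
    }


def check_merge(original_root, new_root, merged_root):
    """Return a list of problems; an empty list means the merge looks right."""
    original = fields_by_name(original_root)
    new = fields_by_name(new_root)
    merged = fields_by_name(merged_root)
    problems = []
    # Every field of the original file must survive with identical values.
    for name, values in original.items():
        if merged.get(name) != values:
            problems.append(f"{name}: original values were changed or lost")
    # Every field that exists only in the new file must have been added as-is.
    for name, values in new.items():
        if name not in original and merged.get(name) != values:
            problems.append(f"{name}: new field is missing or altered")
    return problems
```

With the file names used in this answer, the call would be check_merge(ET.parse("ShorterXML.xml").getroot(), ET.parse("LongerXML.xml").getroot(), ET.parse("output.xml").getroot()). The check compares fields by Name only and ignores element order and whitespace.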
Upvotes: 1
Reputation: 241868
I was able to merge the two files in the given way using xsh, a wrapper around XML::LibXML that uses libxml2 under the hood:
my $old := open old.xml ;
$field := hash Name //Field ;
open new.xml ;
for //Field {
$exists = xsh:lookup('field', Name) ;
if not($exists)
copy . into $old/config/FieldAttrs ;
}
save :f merged.xml $old ;
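For readers without xsh, the same lookup-then-copy logic can be sketched with Python's standard library, assuming Python 3 is installed (the question does not say so). The function names merge_fields and merge_files are hypothetical; the file names old.xml, new.xml and merged.xml are the ones used in the xsh script above.

```python
import xml.etree.ElementTree as ET


def merge_fields(old_root, new_root):
    """Append to old_root every Field of new_root whose Name is not yet present."""
    target = old_root.find("FieldAttrs")
    # Names already defined in the original file; these fields stay untouched.
    known = {field.findtext("Name") for field in target.findall("Field")}
    for field in new_root.findall("FieldAttrs/Field"):
        name = field.findtext("Name")
        if name not in known:
            target.append(field)
            known.add(name)
    return old_root


def merge_files(old_path, new_path, out_path):
    """Read both files, merge new fields into the old tree, write the result."""
    old_tree = ET.parse(old_path)
    new_tree = ET.parse(new_path)
    merge_fields(old_tree.getroot(), new_tree.getroot())
    old_tree.write(out_path, encoding="utf-8", xml_declaration=True)
```

A call such as merge_files("old.xml", "new.xml", "merged.xml") would mirror the script above. One caveat: ElementTree does not keep namespace declarations that no element or attribute uses, so the xmlns:xsi declaration on config is dropped when the file is written. If that declaration must survive, the XSLT or xsh approach is the safer choice.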
Upvotes: 0