Sarra

Reputation: 13

Merge multiple XML files with a shell script

I want to merge multiple XML files. I found a command that merges two XML files perfectly (Merge Command). To merge more than two files, I put that command into a shell script.

The script is as follows:

#!/bin/bash
# recep is the directory containing the XML files
for i in `ls recep`
do
# tt is a file that we create beforehand and initialize with the first XML file
saxon tt merge.xslt with=$i > aux
cp aux tt
done
cat tt

However, the script executes only one merge.

Thank you for your help.

Upvotes: 0

Views: 5559

Answers (2)

Martin Honnen

Reputation: 167716

As an alternative, in XSLT 3.0 it is also possible to run the merge stylesheet on a whole collection of files, using the functions uri-collection, transform and fold-left from https://www.w3.org/TR/xpath-functions-31/:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs math mf"
    version="3.0">

    <xsl:param name="input-dir" as="xs:string?" select="'.'"/>
    <xsl:param name="file-selection-pattern" as="xs:string" select="'?select=*.xml'"/>

    <!-- saved merge.xslt from http://web.archive.org/web/20160809092524/http://www2.informatik.hu-berlin.de/~obecker/XSLT/#merge as original-merge.xslt -->
    <xsl:param name="merge-code-uri" as="xs:string" select="'original-merge.xslt'"/>
    <xsl:param name="merge-sheet" as="document-node()" select="doc($merge-code-uri)"/>

    <!-- 
    Call Saxon 9.8 with option -it to start with below template that allows merging a collection of files
    as specified by the parameters $input-dir and $file-selection-pattern.   
    -->
    <xsl:template name="xsl:initial-template">
        <xsl:variable name="input-uris" as="xs:anyURI*" select="uri-collection($input-dir || $file-selection-pattern)"/>
        <xsl:sequence select="mf:merge($input-uris)"/>
    </xsl:template>

    <xsl:function name="mf:merge" as="node()*">
        <xsl:param name="input-uris" as="xs:anyURI*"/>
        <xsl:sequence select="fold-left(tail($input-uris), doc(head($input-uris)), mf:merge#2)"/>
    </xsl:function>

    <xsl:function name="mf:merge" as="node()*">
        <xsl:param name="doc1" as="document-node()"/>
        <xsl:param name="doc2-uri" as="xs:string"/>
        <xsl:sequence select="transform(map { 
            'stylesheet-node' : $merge-sheet,
            'source-node' : $doc1,
            'stylesheet-params' : map { xs:QName('with') : $doc2-uri }
            })?output"/>
    </xsl:function>

</xsl:stylesheet>

A more detailed explanation is in http://xslt-3-by-example.blogspot.de/2017/07/functional-programming-with-fold-left.html.
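
For reference, an invocation along these lines should start the stylesheet above from the command line. This is only a sketch: the stylesheet file name merge-collection.xslt, the output name merged.xml, the SAXON_JAR variable with its /pathToSaxon/saxon9he.jar default, and the DRY_RUN switch are assumptions made for illustration. It relies on Saxon's -it, -xsl: and -o: options and on passing stylesheet parameters as name=value.

```shell
#!/bin/bash
# Sketch: run the collection-merging stylesheet with Saxon's -it option.
# Assumed names (not taken from the answer): merge-collection.xslt,
# merged.xml, SAXON_JAR, DRY_RUN.

run_collection_merge() {
  local dir=${1:-.}            # value for the input-dir stylesheet parameter
  local out=${2:-merged.xml}   # file that receives the merged result
  local cmd=(java -jar "${SAXON_JAR:-/pathToSaxon/saxon9he.jar}"
             -it -xsl:merge-collection.xslt "-o:$out" "input-dir=$dir")

  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "${cmd[*]}"  # only print the command line
  else
    "${cmd[@]}"                # actually run Saxon
  fi
}
```

Calling run_collection_merge recep merged.xml would then merge every *.xml file found under recep. A relative input-dir is resolved against the stylesheet's base URI, so an absolute file: URI may be the safer choice, and writing the output outside the input directory keeps it from being picked up on a later run.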

Upvotes: 0

zx485

Reputation: 29052

I created a simple test case to check your script. So I downloaded the merge.xslt file mentioned and created some files.

Overall, the test case looks like this:

tt:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <List>
    <Field0>Value X</Field0>
  </List>
</root>

a1.xml to a4.xml in a subdirectory named recep, where the number in the FieldN element name matches the file number (a1.xml is shown):

<root>
  <List>
    <Field1>Value X</Field1>
  </List>
</root>
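
If you want to recreate these files quickly, something like the following sketch should do. The function name make_fixtures and its optional base-directory argument are my own illustrative choices; the file names and contents follow the description above.

```shell
#!/bin/bash
# Sketch: recreate the test files described above.
# make_fixtures and its base-directory argument are illustrative names.

make_fixtures() {
  local base=${1:-.} n
  mkdir -p "$base/recep" || return 1

  # tt: the accumulator file, starting out with Field0
  printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>' \
    '<root>' '  <List>' '    <Field0>Value X</Field0>' '  </List>' '</root>' \
    > "$base/tt"

  # recep/a1.xml .. recep/a4.xml: the FieldN name matches the file number
  for n in 1 2 3 4; do
    printf '<root>\n  <List>\n    <Field%s>Value X</Field%s>\n  </List>\n</root>\n' \
      "$n" "$n" > "$base/recep/a$n.xml"
  done
}
```

Running make_fixtures with no argument creates tt and the recep directory in the current directory.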

Then I slightly modified your script to match my Saxon installation:

#!/bin/bash
for i in `ls recep` 
do
  java -jar /pathToSaxon/saxon9he.jar --suppressXsltNamespaceCheck tt merge.xslt with=recep/$i > aux  
  cp aux tt     
done
cat tt

After executing the script tt contains:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <List>
    <Field0>Value X</Field0>
    <Field1>Value X</Field1>
    <Field2>Value X</Field2>
    <Field3>Value X</Field3>
    <Field4>Value X</Field4>
  </List>
</root>

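So the final result is:
I cannot reproduce your error. It is probably something else, such as a missing directory name: your script passes with=$i, which is only the bare file name printed by ls recep, whereas I passed with=recep/$i (ls recep/* would also give you paths that include the directory). merge.xslt itself works as expected.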
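
A slightly more defensive variant of the loop, as a sketch only: it iterates over a glob instead of parsing ls, so each path already carries the directory name, and it stops if a merge step fails. The function names merge_two and merge_dir and the SAXON_JAR and MERGE_TWO variables are hypothetical; the Saxon call is the same one as in my script above and assumes merge.xslt is in the current directory.

```shell
#!/bin/bash
# Sketch: fold every XML file of a directory into one accumulator file.
# merge_two, merge_dir, SAXON_JAR and MERGE_TWO are illustrative names.

merge_two() {
  # $1 = accumulator file, $2 = file to merge in; merged XML goes to stdout
  java -jar "${SAXON_JAR:-/pathToSaxon/saxon9he.jar}" \
    --suppressXsltNamespaceCheck "$1" merge.xslt "with=$2"
}

merge_dir() {
  local dir=$1 acc=$2 tmp f
  tmp=$(mktemp) || return 1
  for f in "$dir"/*.xml; do          # the glob keeps the directory prefix
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matches
    # MERGE_TWO lets you swap in another merge step; default is merge_two
    "${MERGE_TWO:-merge_two}" "$acc" "$f" > "$tmp" || { rm -f "$tmp"; return 1; }
    cp "$tmp" "$acc"                 # the result becomes the next accumulator
  done
  rm -f "$tmp"
}
```

With the test files from above, merge_dir recep tt followed by cat tt corresponds to the original script.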

Upvotes: 1
