Reputation:
File a.xml:
<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="pivot.cs">
<DATA RECORDS="2">
<RECORD ID="1">
<INTERNALID>5510</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</RECORD>
<RECORD ID="2">
<INTERNALID>5511</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</RECORD>
<INTERNALID>5537</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</DATA>
</TABLE>
File b.xml:
<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="ALT.CS">
<DATA RECORDS="20">
<RECORD ID="53">
<RECNO>5510</RECNO>
<TOBEEXTRACTED>TIM</TOBEEXTRACTED>
</RECORD>
<RECORD ID="53">
<RECNO>5510</RECNO>
<TOBEEXTRACTED>KLM</TOBEEXTRACTED>
</RECORD>
<RECORD ID="54">
<RECNO>5510</RECNO>
<TOBEEXTRACTED>KAB</TOBEEXTRACTED>
</RECORD>
<RECORD ID="55">
<RECNO>5511</RECNO>
<TOBEEXTRACTED>BUS WEE</TOBEEXTRACTED>
</RECORD>
<RECORD ID="59">
<RECNO>5512</RECNO>
</RECORD>
<RECORD ID="60">
<RECNO>5513</RECNO>
</RECORD>
<RECORD ID="5511">
<RECNO>5598</RECNO>
<TOBEEXTRACTED>FBV</TOBEEXTRACTED>
</RECORD>
</DATA>
</TABLE>
The output file should be a.xml, but with the TOBEEXTRACTED element text from b.xml appended to CODAL in square brackets when the INTERNALID is matched one or two times:
<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="pivot.cs">
<DATA RECORDS="2">
<RECORD ID="1">
<INTERNALID>5510</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</RECORD>
<RECORD ID="2">
<INTERNALID>5511</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD [BUS WEE]</CODAL>
</RECORD>
<INTERNALID>5537</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</DATA>
</TABLE>
It would also be very helpful to get a txt file as output with the following information, from file a.xml:
INTERNALID: 5511 (and all the rest in a normal xml file) was matched.
INTERNALID: 5510 was matched more than two times, so no join took place.
INTERNALID: 5537 did not match
RECNO 5512 did not have a TOBEEXTRACTED element.
Upvotes: 0
Views: 655
Reputation: 167716
If you use a key, as suggested in a comment, you can reference and match elements as follows:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:param name="doc2">
    <TABLE NAME="ALT.CS">
      <DATA RECORDS="20">
        <RECORD ID="53">
          <RECNO>5510</RECNO>
          <TOBEEXTRACTED>TIM</TOBEEXTRACTED>
        </RECORD>
        <RECORD ID="53">
          <RECNO>5510</RECNO>
          <TOBEEXTRACTED>KLM</TOBEEXTRACTED>
        </RECORD>
        <RECORD ID="54">
          <RECNO>5510</RECNO>
          <TOBEEXTRACTED>KAB</TOBEEXTRACTED>
        </RECORD>
        <RECORD ID="55">
          <RECNO>5511</RECNO>
          <TOBEEXTRACTED>BUS WEE</TOBEEXTRACTED>
        </RECORD>
        <RECORD ID="59">
          <RECNO>5512</RECNO>
        </RECORD>
        <RECORD ID="60">
          <RECNO>5513</RECNO>
        </RECORD>
        <RECORD ID="5511">
          <RECNO>5598</RECNO>
          <TOBEEXTRACTED>FBV</TOBEEXTRACTED>
        </RECORD>
      </DATA>
    </TABLE>
  </xsl:param>

  <xsl:key name="ref" match="DATA/RECORD[TOBEEXTRACTED]" use="RECNO"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="DATA/RECORD[key('ref', INTERNALID, $doc2)]/CODAL">
    <xsl:copy>
      <xsl:apply-templates select="node(), key('ref', ../INTERNALID, $doc2)/TOBEEXTRACTED"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="DATA/RECORD[not(key('ref', INTERNALID, $doc2))]"/>

  <xsl:template match="TOBEEXTRACTED">
    <xsl:value-of select="concat(' [', ., ']')"/>
  </xsl:template>

</xsl:transform>
That produces the output you have posted; a working example is at http://xsltransform.net/a9Giwy. There I have used an xsl:param name="doc2" with inline contents, but you can of course use <xsl:param name="doc2" select="doc('fileb.xml')"/> instead.
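The stylesheet above appends every matching TOBEEXTRACTED value, while the question asks for the join only when one or two records match. A minimal sketch of that restriction follows. It assumes that b.xml sits next to the stylesheet and is loaded with doc(), and the variable name matches is made up for illustration. Unlike the stylesheet above, it leaves records without a match in place unchanged.

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Assumption: b.xml is located next to this stylesheet; a relative URI
       given to doc() is resolved against the stylesheet's base URI. -->
  <xsl:param name="doc2" select="doc('b.xml')"/>

  <!-- Index the b.xml records that have a TOBEEXTRACTED child by RECNO. -->
  <xsl:key name="ref" match="DATA/RECORD[TOBEEXTRACTED]" use="RECNO"/>

  <!-- Identity template: copy whatever no other template handles. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Append the bracketed values only if one or two records match;
       with zero or more than two matches CODAL is copied unchanged. -->
  <xsl:template match="DATA/RECORD/CODAL">
    <xsl:variable name="matches" select="key('ref', ../INTERNALID, $doc2)"/>
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <xsl:if test="count($matches) = (1, 2)">
        <xsl:for-each select="$matches/TOBEEXTRACTED">
          <xsl:value-of select="concat(' [', ., ']')"/>
        </xsl:for-each>
      </xsl:if>
    </xsl:copy>
  </xsl:template>

</xsl:transform>
```

The comparison count($matches) = (1, 2) is a general comparison, so it is true when the count equals either value.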
Because an edit additionally tagged the question as xslt-3.0, I have also tried to implement it using the xsl:merge instruction of that version:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    exclude-result-prefixes="xs math"
    version="3.0">

  <xsl:param name="doc2-uri" as="xs:string" select="'test201705120102.xml'"/>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:output indent="yes"/>

  <xsl:template match="TABLE/DATA">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:merge>
        <xsl:merge-source name="internal" select="RECORD">
          <xsl:merge-key select="INTERNALID"/>
        </xsl:merge-source>
        <xsl:merge-source name="recno" select="doc($doc2-uri)//RECORD">
          <xsl:merge-key select="RECNO"/>
        </xsl:merge-source>
        <xsl:merge-action>
          <xsl:if test="current-merge-group('internal') and current-merge-group('recno')">
            <xsl:copy>
              <xsl:copy-of select="@*, * except CODAL"/>
              <CODAL>
                <xsl:value-of select="CODAL, current-merge-group('recno')/TOBEEXTRACTED/('[' || . || ']')"/>
              </CODAL>
            </xsl:copy>
          </xsl:if>
        </xsl:merge-action>
      </xsl:merge>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
Upvotes: 0
Reputation: 163595
This kind of merging can often be accomplished using xsl:for-each-group:
<xsl:for-each-group select="$doc1//REC, $doc2//REC" group-by="RECNO">
...
</xsl:for-each-group>
In the body, current-group() holds the records from both files that share the current key. You can separate them out with, for example:
<xsl:variable name="doc1rec" select="current-group()[(/) is $doc1]"/>
<xsl:variable name="doc2rec" select="current-group()[(/) is $doc2]"/>
and then the remaining processing should be straightforward if you understand the logic (which I don't).
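As one possible reading of that logic, here is a minimal sketch of the grouping approach that writes the text report requested in the question. It is an illustration under assumptions, not a definitive implementation: a.xml is the principal input, b.xml sits next to the stylesheet, every INTERNALID is inside a RECORD, and the variable names key and hits are made up. It uses root(.) in place of (/) to test which document a record comes from.

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- $doc1 is the principal input (a.xml); the location of b.xml is an assumption. -->
    <xsl:variable name="doc1" select="."/>
    <xsl:variable name="doc2" select="doc('b.xml')"/>

    <!-- Group the records of both files by INTERNALID (a.xml) or RECNO (b.xml). -->
    <xsl:for-each-group select="$doc1//RECORD, $doc2//RECORD"
                        group-by="INTERNALID | RECNO">
      <xsl:variable name="key" select="current-grouping-key()"/>
      <xsl:variable name="doc1rec" select="current-group()[root(.) is $doc1]"/>
      <xsl:variable name="doc2rec" select="current-group()[root(.) is $doc2]"/>
      <!-- Only b.xml records that carry a TOBEEXTRACTED child count as matches. -->
      <xsl:variable name="hits" select="count($doc2rec[TOBEEXTRACTED])"/>

      <!-- One report line per key that occurs in a.xml. -->
      <xsl:if test="$doc1rec">
        <xsl:value-of select="concat('INTERNALID: ', $key,
            (if ($hits = 0) then ' did not match'
             else if ($hits le 2) then ' was matched.'
             else ' was matched more than two times, so no join took place.'),
            '&#10;')"/>
      </xsl:if>

      <!-- One report line per b.xml record that lacks TOBEEXTRACTED. -->
      <xsl:for-each select="$doc2rec[not(TOBEEXTRACTED)]">
        <xsl:value-of select="concat('RECNO ', $key,
            ' did not have a TOBEEXTRACTED element.&#10;')"/>
      </xsl:for-each>
    </xsl:for-each-group>
  </xsl:template>

</xsl:transform>
```

Groups come out in order of first appearance, so keys from a.xml are reported first, followed by keys that occur only in b.xml.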
Upvotes: 0