user1728778

Reputation: 151

Join two xml files based on node text with xslt

Is it possible to join two XML files on a node value, like a SQL join?

I have two xml files:

<MailPackage>
   <Mail>
      <id>1</id>
      <field_1>foo</field_1>
      ...
      <field_n>bar</field_n>
   </Mail>
   <Mail>
      <id>2</id>
      <field_1>... </field_1>
       ...
   </Mail>
   ....
</MailPackage>

and

<Transaction_data>
   <Transaction>
     <id>1</id>
     <account_number>10 </account_number>
     ....
   </Transaction>
   <Transaction>
     <id>1</id>
     <account_number> 50 </account_number>
      ....
   </Transaction>
   <Transaction>
     <id>2</id>
     <account_number> 20 </account_number>
      ....
   </Transaction>
</Transaction_data>

Now I'd like to join the two xml files by the value of the 'id' node. The expected result is:

<MailPackage>
   <Mail>
      <id>1 </id>
      <field_1>foo </field_1>
      ...
      <field_n>bar </field_n>
      <Transaction_data>
         <Transaction>
            <Account_number>10</Account_number>
             ...
         </Transaction>
         <Transaction>
            <Account_number>50 </Account_number>
             ...
         </Transaction>
      </Transaction_data>
   </Mail>
   <Mail>
      <id> 2 </id>
      <field_1> ...</field_1>
       ...
      <Transaction_data>
         <Transaction>
            <Account_number> 20 </Account_number>
             ....
         </Transaction>
      </Transaction_data>
   </Mail>
</MailPackage>

Can you give me some pointers on how to begin?

Upvotes: 1

Views: 1023

Answers (1)

Ian Roberts

Reputation: 122364

You can define an <xsl:key> to group the Transaction elements by ID, then insert them at the appropriate places in the main file. This article explains a trick using <xsl:for-each> to select nodes matching a key from a secondary document: in XSLT 1.0, key() only searches the document that contains the context node, so the for-each is there to switch the context to the transactions file before the lookup. If you have XSLT 2.0 you don't need this trick; just use the three-argument form of the key() function.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes" />

  <xsl:key name="trans" match="Transaction" use="id" />

  <!-- Identity template to copy everything we don't specifically override -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()" /></xsl:copy>
  </xsl:template>

  <!-- override for Mail elements -->
  <xsl:template match="Mail">
    <xsl:copy>
      <!-- copy all children as normal -->
      <xsl:apply-templates select="@*|node()" />
      <xsl:variable name="myId" select="id" />
      <Transaction_data>
        <xsl:for-each select="document('transactions.xml')">
          <!-- process all transactions with the right ID -->
          <xsl:apply-templates select="key('trans', $myId)" />
        </xsl:for-each>
      </Transaction_data>
    </xsl:copy>
  </xsl:template>

  <!-- omit the id element when copying a Transaction -->
  <xsl:template match="Transaction/id" />
</xsl:stylesheet>

You would process the <MailPackage> document as the main input document, and the stylesheet references the transactions document internally.

This all assumes that your Mail elements all have unique IDs.
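
For the XSLT 2.0 route mentioned above, here is a minimal sketch of the same stylesheet using the three-argument form of key(). It assumes an XSLT 2.0 processor and the same transactions.xml file name as the 1.0 version; the variable name transDoc is only illustrative.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes" />

  <xsl:key name="trans" match="Transaction" use="id" />

  <!-- Assumption: the secondary file is transactions.xml, as in the 1.0 stylesheet.
       transDoc is an illustrative name for the loaded document node. -->
  <xsl:variable name="transDoc" select="document('transactions.xml')" />

  <!-- Identity template to copy everything we don't specifically override -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()" /></xsl:copy>
  </xsl:template>

  <!-- override for Mail elements -->
  <xsl:template match="Mail">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
      <Transaction_data>
        <!-- The third argument tells key() which document tree to search,
             so no for-each context switch is needed. -->
        <xsl:apply-templates select="key('trans', id, $transDoc)" />
      </Transaction_data>
    </xsl:copy>
  </xsl:template>

  <!-- omit the id element when copying a Transaction -->
  <xsl:template match="Transaction/id" />
</xsl:stylesheet>
```

As with the 1.0 version, the <MailPackage> document is the main input. The key lookup compares the string values of the id elements, so the ids in both files need to match exactly, including any surrounding whitespace inside the element.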

Upvotes: 1
