atif

Reputation: 1147

Merging and restructuring node content based on attribute values

I have an XML file that contains the following markup:

    <xml>
        <content relationship="regula">
            <source attribute1="RSC1985s5c1" attribute2="6(17)"/>
            <target attribute1="LRC1985s5c1" attribute2="6(17)1"/>
        </content>

        <content relationship="translation-of">
            <source attribute1="RSC1985s5c1" attribute2="6(17)"/>
            <target attribute1="LRC1985s5c4" attribute2="6(17)1"/>
        </content>

        <content relationship="translation-of">
            <source attribute1="RSC1985s5c2" attribute2="7(17)"/>
            <target attribute1="LRC1985s5c2" attribute2="7(17)"/>
        </content>

        <content relationship="translation-of">
            <source attribute1="RSC1985s5c1" attribute2="6(17)"/>
            <target attribute1="LRC1985s5c6" attribute2="6(17)2"/>
        </content>
    </xml>

What I want is to merge the content of nodes into one new node if the attribute1 and attribute2 values of their source nodes are equal. So the output should look like this:

    <xml>
        <transformed relationship="merged">
            <source attribute1="RSC1985s5c1" attribute2="6(17)"/>
            <target attribute1="LRC1985s5c1" attribute2="6(17)1"/>
            <target attribute1="LRC1985s5c4" attribute2="6(17)1"/>
            <target attribute1="LRC1985s5c6" attribute2="6(17)2"/>
        </transformed>

        <transformed relationship="non-merged">
            <source attribute1="RSC1985s5c2" attribute2="7(17)"/>
            <target attribute1="LRC1985s5c2" attribute2="7(17)"/>
        </transformed>
    </xml>

The first, second, and fourth content nodes have equal attribute1 and attribute2 values on their source elements, which is why I have combined them into one new node. The source in the third node doesn't match any of the others, which is why I have output it separately. I tried using a for-each loop but couldn't get it to work. I would appreciate your help if this can be achieved using template matching.

Any content nodes whose child "source" elements have the same attribute values should be grouped together regardless of their position. The relationship should change to "merged" for the merged items and to "non-merged" for the items that are not merged.

Upvotes: 1

Views: 286

Answers (1)

Tim C

Reputation: 70618

This can be achieved with Muenchian grouping.

Because you need to match on two separate attributes of the source element, you will need to use a concatenated key, like so:

<xsl:key name="dupes" 
  match="content/source" 
  use="concat(@attribute1, '|', @attribute2)" />

It is important to pick a concatenating character to separate the two attributes (a pipe in this case) that can never appear in the two attribute values.
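
To illustrate why the separator matters, here is a minimal sketch using made-up values ('ab', 'c' and 'a', 'bc' are hypothetical and do not come from your data). Without a separator, two different attribute pairs can concatenate to the same key and would wrongly end up in the same group.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>

   <xsl:template match="/">
      <!-- Hypothetical pairs ('ab', 'c') and ('a', 'bc'): both collapse to 'abc' -->
      <xsl:text>without separator, keys equal: </xsl:text>
      <xsl:value-of select="concat('ab', 'c') = concat('a', 'bc')"/>
      <xsl:text>&#10;</xsl:text>
      <!-- With a pipe between the parts, 'ab|c' and 'a|bc' stay distinct -->
      <xsl:text>with separator, keys equal: </xsl:text>
      <xsl:value-of select="concat('ab', '|', 'c') = concat('a', '|', 'bc')"/>
      <xsl:text>&#10;</xsl:text>
   </xsl:template>
</xsl:stylesheet>
```

Run against any XML document, the first comparison should report true and the second false.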

Normally, to select the first element in each group, you can then do this:

<xsl:apply-templates select="content/source
   [generate-id() = 
    generate-id(key('dupes', concat(@attribute1, '|', @attribute2))[1])]" />

However, you would need to do a bit of extra work, as you need to know which source elements have multiple items in the group, and which just consist of single items. So, to get groups with multiple items, you can do the following:

<xsl:apply-templates select="content/source
   [generate-id() = 
      generate-id(key('dupes', concat(@attribute1, '|', @attribute2))[1])]
      [count(key('dupes', concat(@attribute1, '|', @attribute2))) > 1]" />

Here is the full XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <!-- Request indented output without an XML declaration -->
   <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

   <!-- Index every source element by its combined attribute values -->
   <xsl:key name="dupes" match="content/source" use="concat(@attribute1, '|', @attribute2)"/>

   <xsl:template match="/xml">
      <xsl:copy>
         <!-- First source of each group that has more than one member -->
         <transformed relationship="merged">
            <xsl:apply-templates select="content/source[generate-id() = generate-id(key('dupes', concat(@attribute1, '|', @attribute2))[1])][count(key('dupes', concat(@attribute1, '|', @attribute2))) &gt; 1]"/>
         </transformed>
         <!-- Sources whose group has exactly one member -->
         <transformed relationship="non-merged">
            <xsl:apply-templates select="content/source[generate-id() = generate-id(key('dupes', concat(@attribute1, '|', @attribute2))[1])][count(key('dupes', concat(@attribute1, '|', @attribute2))) = 1]"/>
         </transformed>
      </xsl:copy>
   </xsl:template>

   <xsl:template match="source">
      <!-- Copy the source once, then the target that follows each source in its group -->
      <xsl:copy-of select="."/>
      <xsl:copy-of select="key('dupes', concat(@attribute1, '|', @attribute2))/following-sibling::target[1]"/>
   </xsl:template>
</xsl:stylesheet>

When applied to your sample XML, the following is output:

<xml>
   <transformed relationship="merged">
      <source attribute1="RSC1985s5c1" attribute2="6(17)"/>
      <target attribute1="LRC1985s5c1" attribute2="6(17)1"/>
      <target attribute1="LRC1985s5c4" attribute2="6(17)1"/>
      <target attribute1="LRC1985s5c6" attribute2="6(17)2"/>
   </transformed>
   <transformed relationship="non-merged">
      <source attribute1="RSC1985s5c2" attribute2="7(17)"/>
      <target attribute1="LRC1985s5c2" attribute2="7(17)"/>
   </transformed>
</xml>
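
Note that the stylesheet above writes exactly one "merged" and one "non-merged" element, so if your real input contained several distinct groups of duplicates, they would all land inside the same merged element. If you would rather have one transformed element per group, a variant along the following lines might work. This is only a sketch under that assumption. The `group` variable name is my own choice, and the key is the same one as before.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="xml" indent="yes"/>

   <xsl:key name="dupes" match="content/source" use="concat(@attribute1, '|', @attribute2)"/>

   <xsl:template match="/xml">
      <xsl:copy>
         <!-- Visit only the first source of each distinct attribute pair -->
         <xsl:apply-templates select="content/source[generate-id() = generate-id(key('dupes', concat(@attribute1, '|', @attribute2))[1])]"/>
      </xsl:copy>
   </xsl:template>

   <xsl:template match="source">
      <!-- All source elements that share this element's attribute pair -->
      <xsl:variable name="group" select="key('dupes', concat(@attribute1, '|', @attribute2))"/>
      <transformed>
         <!-- Label the group by its size: more than one member means merged -->
         <xsl:attribute name="relationship">
            <xsl:choose>
               <xsl:when test="count($group) &gt; 1">merged</xsl:when>
               <xsl:otherwise>non-merged</xsl:otherwise>
            </xsl:choose>
         </xsl:attribute>
         <xsl:copy-of select="."/>
         <!-- The target that directly follows each source in the group -->
         <xsl:copy-of select="$group/following-sibling::target[1]"/>
      </transformed>
   </xsl:template>
</xsl:stylesheet>
```

With this variant, the groups are written in the document order of their first source element, so merged and non-merged groups may be interleaved rather than collected into two blocks.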

Upvotes: 1
