jackzee

Reputation: 21

Flatten nodes that have repeated child node using XSLT

I have a document with the following structure (just an example to help me describe the problem) that I'm trying to flatten. By flattening I mean splitting every <Report_Entry> node that has several <Event> children into copies, so that each resulting <Report_Entry> contains just a single <Event>.

What I have:

<?xml version="1.0"?>
<Report_Data>
  <Report_Entry>
    <ID>1</ID>
    <Event>
      <Start_Date>2011-09-06</Start_Date>
      <End_Date>2011-09-10</End_Date>
    </Event>
    <Event>
      <Start_Date>2011-09-10</Start_Date>
      <End_Date>2011-09-15</End_Date>
    </Event>
    <Event>
      <Start_Date>2011-09-15</Start_Date>
      <End_Date>2011-09-20</End_Date>
    </Event>
  </Report_Entry>
  <Report_Entry>
    <ID>2</ID>
    <Event>
      <Start_Date>2011-09-20</Start_Date>
      <End_Date>2011-09-25</End_Date>
    </Event>
    <Event>
      <Start_Date>2011-09-25</Start_Date>
      <End_Date>2011-09-30</End_Date>
    </Event>
  </Report_Entry>
  <Report_Entry>
    <ID>3</ID>
    <Event>
      <Start_Date>2011-09-30</Start_Date>
      <End_Date>2011-10-05</End_Date>
    </Event>
  </Report_Entry>
</Report_Data>

What I'm trying to get:

<?xml version="1.0"?>
<Report_Data>
  <Report_Entry>
    <ID>1</ID>
    <Event>
      <Start_Date>2011-09-06</Start_Date>
      <End_Date>2011-09-10</End_Date>
    </Event>
  </Report_Entry>
  <Report_Entry>
    <ID>1</ID>
    <Event>
      <Start_Date>2011-09-10</Start_Date>
      <End_Date>2011-09-15</End_Date>
    </Event>
  </Report_Entry>
  <Report_Entry>
    <ID>1</ID>
    <Event>
      <Start_Date>2011-09-15</Start_Date>
      <End_Date>2011-09-20</End_Date>
    </Event>
  </Report_Entry>
  <Report_Entry>
    <ID>2</ID>
    <Event>
      <Start_Date>2011-09-20</Start_Date>
      <End_Date>2011-09-25</End_Date>
    </Event>
  </Report_Entry>
  <Report_Entry>
    <ID>2</ID>
    <Event>
      <Start_Date>2011-09-25</Start_Date>
      <End_Date>2011-09-30</End_Date>
    </Event>
  </Report_Entry>
  <Report_Entry>
    <ID>3</ID>
    <Event>
      <Start_Date>2011-09-30</Start_Date>
      <End_Date>2011-10-05</End_Date>
    </Event>
  </Report_Entry>
</Report_Data>

Here is XSLT that I'm using:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="node() | @*">
    <xsl:copy>
        <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="Report_Entry">
  <xsl:for-each select="Event">
    <Report_Entry>
      <xsl:copy-of select="../*[not(self::Event)]"/>
      <xsl:copy-of select="."/>
    </Report_Entry>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

It works, though I feel that there might be a better, faster and more general solution. In particular, I don't like hardcoding <Report_Entry>, since that way I can't copy its attributes (if any). Are there other ways/templates to deal with this problem?

Upvotes: 0

Views: 776

Answers (2)

psmay

Reputation: 1021

Your solution couldn't be much simpler, so no need to worry on that front. When writing code, and especially XSLT, clarity tends to be worth a lot more than raw efficiency.

As for the hardcoded element name and copying attributes, here's a start:

<xsl:template match="Report_Entry">
  <xsl:variable name="parent-name" select="name()"/>
  <xsl:variable name="parent-attributes" select="@*"/>
  <xsl:for-each select="Event">
    <xsl:element name="{$parent-name}">
      <xsl:copy-of select="$parent-attributes"/>
      <xsl:copy-of select="../*[not(self::Event)]"/>
      <xsl:copy-of select="."/>
    </xsl:element>
  </xsl:for-each>
</xsl:template>

The variables stash part of the context as it exists outside the for-each, where the context node is still the Report_Entry. xsl:element creates an element with the same name as the original, whatever it is called, and the first copy-of copies in the original's attributes as well. If your data later gains attributes, the stylesheet is ready for them.
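A variation on the same idea, offered only as a sketch: instead of rebuilding the parent with name() and xsl:element, step back to the parent inside the loop and let xsl:copy clone it. This assumes XSLT 1.0 and the question's input; the variable names entry and event are made up for illustration. One possible advantage is that xsl:copy keeps the element's namespace, whereas xsl:element resolves a prefixed name against the namespace declarations in the stylesheet.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies everything not matched below. -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Report_Entry">
    <!-- Hypothetical variable names; they only hold the two nodes we need. -->
    <xsl:variable name="entry" select="."/>
    <xsl:for-each select="Event">
      <xsl:variable name="event" select="."/>
      <!-- Make the parent the context node again so xsl:copy clones it, not the Event. -->
      <xsl:for-each select="$entry">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <xsl:copy-of select="*[not(self::Event)]"/>
          <xsl:copy-of select="$event"/>
        </xsl:copy>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

As with the original stylesheet, a Report_Entry that has no Event children produces no output at all.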

Not hardcoding the name buys little in this case, but it would pay off if, say, you factored that part out into a named template and called it from several places:

<xsl:template name="collapse-the-thing">
  <xsl:param name="context"/>
  <xsl:param name="sub-element-name" select="'Event'"/>
  <xsl:variable name="parent-name" select="name($context)"/>
  <xsl:variable name="parent-attributes" select="$context/@*"/>
  <xsl:for-each select="$context/*[name()=$sub-element-name]">
    <xsl:element name="{$parent-name}">
      <xsl:copy-of select="$parent-attributes"/>
      <xsl:copy-of select="../*[name()!=$sub-element-name]"/>
      <xsl:copy-of select="."/>
    </xsl:element>
  </xsl:for-each>
</xsl:template>

<xsl:template match="Report_Entry">
  <xsl:call-template name="collapse-the-thing">
    <xsl:with-param name="context" select="."/>
  </xsl:call-template>
</xsl:template>

<xsl:template match="Some_Other_Entry">
  <xsl:call-template name="collapse-the-thing">
    <xsl:with-param name="context" select="."/>
    <xsl:with-param name="sub-element-name" select="'Happening'"/>
  </xsl:call-template>
</xsl:template>

Hope that was enlightening. Enjoy!

Upvotes: 0

Dimitre Novatchev

Reputation: 243599

As simple as this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/*">
  <Report_Data>
    <xsl:apply-templates select="*/Event"/>
  </Report_Data>
 </xsl:template>

 <xsl:template match="Event">
  <Report_Entry>
   <xsl:copy-of select="../ID | ."/>
  </Report_Entry>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<Report_Data>
  <Report_Entry>
    <ID>1</ID>
    <Event>
      <Start_Date>2011-09-06</Start_Date>
      <End_Date>2011-09-10</End_Date>
    </Event>
    <Event>
      <Start_Date>2011-09-10</Start_Date>
      <End_Date>2011-09-15</End_Date>
    </Event>
    <Event>
      <Start_Date>2011-09-15</Start_Date>
      <End_Date>2011-09-20</End_Date>
    </Event>
  </Report_Entry>
  <Report_Entry>
    <ID>2</ID>
    <Event>
      <Start_Date>2011-09-20</Start_Date>
      <End_Date>2011-09-25</End_Date>
    </Event>
    <Event>
      <Start_Date>2011-09-25</Start_Date>
      <End_Date>2011-09-30</End_Date>
    </Event>
  </Report_Entry>
  <Report_Entry>
    <ID>3</ID>
    <Event>
      <Start_Date>2011-09-30</Start_Date>
      <End_Date>2011-10-05</End_Date>
    </Event>
  </Report_Entry>
</Report_Data>

the wanted, correct result is produced:

<Report_Data>
   <Report_Entry>
      <ID>1</ID>
      <Event>
         <Start_Date>2011-09-06</Start_Date>
         <End_Date>2011-09-10</End_Date>
      </Event>
   </Report_Entry>
   <Report_Entry>
      <ID>1</ID>
      <Event>
         <Start_Date>2011-09-10</Start_Date>
         <End_Date>2011-09-15</End_Date>
      </Event>
   </Report_Entry>
   <Report_Entry>
      <ID>1</ID>
      <Event>
         <Start_Date>2011-09-15</Start_Date>
         <End_Date>2011-09-20</End_Date>
      </Event>
   </Report_Entry>
   <Report_Entry>
      <ID>2</ID>
      <Event>
         <Start_Date>2011-09-20</Start_Date>
         <End_Date>2011-09-25</End_Date>
      </Event>
   </Report_Entry>
   <Report_Entry>
      <ID>2</ID>
      <Event>
         <Start_Date>2011-09-25</Start_Date>
         <End_Date>2011-09-30</End_Date>
      </Event>
   </Report_Entry>
   <Report_Entry>
      <ID>3</ID>
      <Event>
         <Start_Date>2011-09-30</Start_Date>
         <End_Date>2011-10-05</End_Date>
      </Event>
   </Report_Entry>
</Report_Data>
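
The stylesheet above copies only ID from the parent. If entries may carry attributes or other non-Event children, one possible variant (a sketch, not part of the original answer) selects everything in a single union. It relies on the XPath 1.0 rule that a node-set is processed in document order, so the parent's attributes come first and the remaining children keep their original positions relative to the current Event.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/*">
  <Report_Data>
    <xsl:apply-templates select="*/Event"/>
  </Report_Data>
 </xsl:template>

 <xsl:template match="Event">
  <Report_Entry>
   <!-- Parent's attributes, its non-Event children, and the current Event,
        copied in document order. -->
   <xsl:copy-of select="../@* | ../*[not(self::Event)] | ."/>
  </Report_Entry>
 </xsl:template>
</xsl:stylesheet>

For the sample document this should give the same result as shown above, since ID is the only non-Event child and there are no attributes.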

Upvotes: 1
