Tench

Reputation: 525

XSLT 2.0 Solution for Merging Sibling Elements With Same Name and Attribute Value

I am looking for a solution that will turn

<p>
<hi rend="bold">aa</hi>
<hi rend="bold">bb</hi>
<hi rend="bold">cc</hi>
Perhaps some text.
<hi rend="italic">dd</hi>
<hi rend="italic">ee</hi>
Some more text.
<hi rend="italic">ff</hi>
<hi rend="italic">gg</hi>
Foo.
</p>

into

<p>
<hi rend="bold">aabbcc</hi>
Perhaps some text.
<hi rend="italic">ddee</hi>
Some more text.
<hi rend="italic">ffgg</hi>
Foo. 
</p>

but my solution should not hardcode the element names or the attribute values (italic, bold). The XSLT should concatenate all adjacent sibling elements that have the same name and the same attribute value. Everything else should be left untouched.

I have looked at the solutions that already exist out there but none of them seemed to satisfy all of my requirements.

If anybody has a handy XSLT stylesheet for this, I'd be much obliged.

Upvotes: 4

Views: 1954

Answers (3)

ABach

Reputation: 3738

In case a casual visitor comes along and wonders whether there is an XSLT 1.0 solution for this problem, I offer the following. Note that I am not trying to detract from Sean's and Martin's correct answers; I am merely offering some flavor.

When this XSLT 1.0 solution:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="no" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:key
     name="kFollowing" 
     match="hi" 
     use="concat(@rend, '+', generate-id(following-sibling::text()[1]))" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/*">
    <p>
      <xsl:apply-templates 
        select="
          hi[generate-id() = 
             generate-id(
           key('kFollowing', 
             concat(@rend, '+', generate-id(following-sibling::text()[1])))[1])]" />
    </p>
  </xsl:template>

  <xsl:template match="hi">
    <xsl:copy>
      <xsl:apply-templates 
        select="@*|key('kFollowing', 
          concat(@rend, '+', generate-id(following-sibling::text()[1])))/text()" />
    </xsl:copy>
    <xsl:apply-templates select="following-sibling::text()[1]" />
  </xsl:template>

</xsl:stylesheet>

...is applied to the OP's original XML:

<p>
<hi rend="bold">aa</hi>
<hi rend="bold">bb</hi>
<hi rend="bold">cc</hi>
Perhaps some text.
<hi rend="italic">dd</hi>
<hi rend="italic">ee</hi>
Some more text.
<hi rend="italic">ff</hi>
<hi rend="italic">gg</hi>
Foo.
</p>

...the desired result is produced:

<p>
<hi rend="bold">aabbcc</hi>
Perhaps some text.
<hi rend="italic">ddee</hi>
Some more text.
<hi rend="italic">ffgg</hi>
Foo. 
</p>
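
The template for the document element above writes a literal p element. If the container may have another name, one possible variant is to let xsl:copy reproduce whatever the document element is. This is only a sketch: it assumes the rest of the stylesheet, including the kFollowing key and xsl:strip-space, stays exactly as shown, and that the input has the same shape as the sample (hi children, each run followed by the text that belongs after it).

```xml
<!-- Drop-in replacement for the match="/*" template above.
     xsl:copy reproduces the document element under its own name
     instead of emitting a literal <p>. -->
<xsl:template match="/*">
  <xsl:copy>
    <!-- Keep any attributes of the document element (handled by the identity template). -->
    <xsl:apply-templates select="@*"/>
    <!-- Same selection as before: the first hi of each (rend, next text node) group. -->
    <xsl:apply-templates
      select="
        hi[generate-id() =
           generate-id(
             key('kFollowing',
               concat(@rend, '+', generate-id(following-sibling::text()[1])))[1])]" />
  </xsl:copy>
</xsl:template>
```

The element name hi is still fixed in the key and in the selection, so this variant only removes the dependency on the name of the parent.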

Upvotes: 1

Sean B. Durkin

Reputation: 12729

This XSLT 2.0 stylesheet merges adjacent elements that have the same name and the same rend attribute value.

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes" />
<xsl:strip-space elements="*" />  

<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()" />
  </xsl:copy>
</xsl:template>

<xsl:template match="*[*/@rend]">
  <xsl:copy>
    <xsl:apply-templates select="@*" />
    <xsl:for-each-group select="node()" group-adjacent="
       if (self::*/@rend) then
           concat( namespace-uri(), '|', local-name(), '|', @rend)
         else
           ''">
      <xsl:choose>
        <xsl:when test="current-grouping-key()" >
          <xsl:for-each select="current-group()[1]">
            <xsl:copy>
              <xsl:apply-templates select="@* | current-group()/node()" />
            </xsl:copy>
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
         <xsl:apply-templates select="current-group()" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

The advantages of this solution over Martin's are:

  • This merges over all parent elements, not just p elements.
  • Faster. Merging is accomplished with a single xsl:for-each-group instead of two nested xsl:for-each-group instructions.
  • The non-rend attributes of the first element in each merged group are copied to the output.

Note also:

  • The test for pure white-space nodes, which must be ignored when deciding whether elements with a common name and rend attribute value are "adjacent", is made unnecessary by the xsl:strip-space instruction. Thus the xsl:for-each-group instruction is fairly simple and readable.
  • As an alternative value for the group-adjacent attribute, you could use:

    <xsl:for-each-group select="node()" group-adjacent="
       string-join(for $x in self::*/@rend return
         concat( namespace-uri(), '|', local-name(), '|', @rend),'')">
    

    Use whichever form you personally find more readable.
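
If xsl:strip-space cannot be used, for example because white space matters elsewhere in the document, the white-space-only text nodes can be left out of the grouped sequence instead, as Martin's select expression does. The following is a minimal sketch of that combination, not a tested replacement; it is the stylesheet above with xsl:strip-space removed and the select changed.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" />

<!-- Identity template: copy everything that is not handled below. -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()" />
  </xsl:copy>
</xsl:template>

<xsl:template match="*[*/@rend]">
  <xsl:copy>
    <xsl:apply-templates select="@*" />
    <!-- White-space-only text nodes are excluded from the population,
         so they cannot split a run of merge-able elements. -->
    <xsl:for-each-group
       select="node() except text()[not(normalize-space())]"
       group-adjacent="
       if (self::*/@rend) then
           concat( namespace-uri(), '|', local-name(), '|', @rend)
         else
           ''">
      <xsl:choose>
        <xsl:when test="current-grouping-key()" >
          <!-- Merge: copy the first element and the children of the whole group. -->
          <xsl:for-each select="current-group()[1]">
            <xsl:copy>
              <xsl:apply-templates select="@* | current-group()/node()" />
            </xsl:copy>
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
         <xsl:apply-templates select="current-group()" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

Note that the excluded white-space-only text nodes are also absent from the output of that parent element, which is the same local effect xsl:strip-space would have had; white space in other elements is left alone.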

Upvotes: 6

Martin Honnen

Reputation: 167471

Is the name of that attribute (e.g. rend) known? In that case I think you want

<xsl:template match="p">
  <xsl:copy>
    <xsl:for-each-group select="*" group-adjacent="concat(node-name(.), '|', @rend)">
      <xsl:element name="{name()}" namespace="{namespace-uri()}">
         <xsl:copy-of select="@rend"/>
         <xsl:apply-templates select="current-group()/node()"/>
      </xsl:element>
     </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

[edit] If there can be text nodes with content between the elements, as you have shown in the edit of your input, then you need to nest two groupings, as in this sample:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs">

<xsl:template match="p">
  <xsl:copy>
    <xsl:for-each-group select="node() except text()[not(normalize-space())]" group-adjacent="boolean(self::*)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <xsl:for-each-group select="current-group()" group-by="concat(node-name(.), '|', @rend)">
            <xsl:element name="{name()}" namespace="{namespace-uri()}">
               <xsl:copy-of select="@rend"/>
               <xsl:apply-templates select="current-group()/node()"/>
            </xsl:element>
          </xsl:for-each-group>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
     </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
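
One thing to be aware of in the sample above: the inner grouping uses group-by, so within a run of elements that has no text in between (say bold, italic, bold) the two bold elements would be merged across the italic one. If only directly consecutive elements should be merged, a possible variant, given here as a sketch under that assumption, is to use group-adjacent for the inner grouping as well:

```xml
<!-- Drop-in replacement for the match="p" template in the stylesheet above. -->
<xsl:template match="p">
  <xsl:copy>
    <!-- Outer grouping: separate runs of elements from runs of other nodes,
         ignoring white-space-only text nodes. -->
    <xsl:for-each-group select="node() except text()[not(normalize-space())]" group-adjacent="boolean(self::*)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <!-- Inner grouping: group-adjacent merges only consecutive elements
               with the same name and the same rend value. -->
          <xsl:for-each-group select="current-group()" group-adjacent="concat(node-name(.), '|', @rend)">
            <xsl:element name="{name()}" namespace="{namespace-uri()}">
               <xsl:copy-of select="@rend"/>
               <xsl:apply-templates select="current-group()/node()"/>
            </xsl:element>
          </xsl:for-each-group>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
     </xsl:for-each-group>
  </xsl:copy>
</xsl:template>
```

For the input shown in the question both versions give the same result, because every run there contains a single rend value.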

Upvotes: 1
