Bob Stuart

Reputation: 103

Using XSL Key to find unique elements based on a wildcard of attributes

XSLT 2.0 is preferred, and I hope it makes this easier.

Given a document similar to

<doc xmlns:bob="bob.com">
    <element bob:name="fred" bob:occupation="Dr">Stuff</element>
    <element bob:name="Bill" bob:occupation="Dr" bob:birthMonth="Jan"/>
    <element>Kill Me</element>
    <element bob:name="fred" bob:occupation="Dr">different Stuff</element>
</doc>

I would like to get all of the unique elements, where uniqueness is based on all the attributes in the bob namespace. This is a representative sample; the real documents have much deeper nesting, so I want to traverse the whole tree for *[@bob:*] and get the unique set of those elements.

The hoped-for output would look like

<doc xmlns:bob="bob.com">
    <element bob:name="fred" bob:occupation="Dr">Stuff</element>
    <element bob:name="Bill" bob:occupation="Dr" bob:birthMonth="Jan"/>
</doc>

where one element was removed for not having any @bob:* attributes and the other was removed for being a duplicate of the first based solely on the attributes.

I tried to use a key, but I don't seem to be doing it right:

<xsl:key name="BobAttributes" match="//*" use="./@bob:*" />

I also tried creating a function that concatenates all the @bob:* attributes, but that didn't do what I hoped either.

<xsl:key name="BobAttributes" match="//*" use="functx:AllBobConcat(.)" />

 <xsl:function name="functx:AllBobConcat" as="xs:string*" 
    xmlns:functx="http://www.functx.com" >
    <xsl:param name="nodes" as="node()*"/> 

    <xsl:for-each select="$nodes/@bob:*">
        <xsl:value-of select="local-name(.)"/>
        <xsl:value-of select="."/>
    </xsl:for-each>
</xsl:function>

In both cases I used "simple" XSL to filter out the unique ones; maybe that is where I went wrong? The variables are only there to help me debug.

  <xsl:template match="*[@bob:*]" priority="100">
        <xsl:variable name="concat">
            <xsl:value-of select="functx:AllBobConcat(.)"/>
        </xsl:variable>

        <xsl:variable name="myID">
            <xsl:value-of select="generate-id() "/>
        </xsl:variable>

        <xsl:variable name="Keylookup">
            <xsl:value-of select="key('BobAttributes', $concat)"/>
        </xsl:variable>
        <xsl:value-of select="concat($concat, $Keylookup, $myID)"/>
        <xsl:if test="generate-id() = generate-id(key('BobAttributes', $concat)[1])">

            <xsl:apply-templates select="." mode="copy"/>
        </xsl:if>
    </xsl:template>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="@*|node()" mode="copy">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" />
        </xsl:copy>
    </xsl:template>

Looking forward to hearing what simple thing I overlooked or completely different cool approach I should have taken.

Upvotes: 1

Views: 911

Answers (2)

Michael Kay

Reputation: 163342

I would define the function AllBobConcat as you do, and then use this as a grouping key:

<xsl:for-each-group select="element" group-by="f:AllBobConcat(.)">
  <xsl:if test="current-grouping-key() != ''">
    <xsl:copy-of select="current-group()[1]"/>
  </xsl:if>
</xsl:for-each-group>

Except that AllBobConcat needs to ensure the attributes are in a canonical order, so:

<xsl:function name="f:AllBobConcat" as="xs:string">
    <xsl:param name="node" as="element(element)"/> 
    <xsl:value-of>
      <xsl:for-each select="$node/@bob:*">
        <xsl:sort select="local-name()"/>
        <xsl:value-of select="local-name(.)"/>
        <xsl:value-of select="'='"/>
        <xsl:value-of select="."/>
        <xsl:value-of select="' '"/>
      </xsl:for-each>
    </xsl:value-of>
</xsl:function>

Also, you shouldn't be putting your functions in a namespace that belongs to someone else.
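
For reference, here is a minimal sketch of how the two pieces above might fit together in one XSLT 2.0 stylesheet. Several details are assumptions rather than part of the answer: the function namespace URI http://example.com/local-functions is a placeholder for a namespace you own, the parameter type is relaxed to element() so that elements of any name are accepted, and the grouping selects .//*[@bob:*] to cover the deeper nesting mentioned in the question. Because only elements with at least one bob:* attribute are selected, the test for an empty grouping key is not needed here.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:bob="bob.com"
    xmlns:f="http://example.com/local-functions"
    exclude-result-prefixes="xs f">

  <!-- f: is bound to a hypothetical namespace; replace it with one you control -->

  <xsl:output indent="yes"/>

  <!-- Canonical key: the bob:* attributes sorted by local name,
       emitted as "name=value " pairs and joined into a single string -->
  <xsl:function name="f:AllBobConcat" as="xs:string">
    <!-- Assumption: element() instead of element(element), to allow any element name -->
    <xsl:param name="node" as="element()"/>
    <xsl:value-of>
      <xsl:for-each select="$node/@bob:*">
        <xsl:sort select="local-name()"/>
        <xsl:value-of select="concat(local-name(), '=', ., ' ')"/>
      </xsl:for-each>
    </xsl:value-of>
  </xsl:function>

  <!-- Copy the document element, then emit the first element of each group -->
  <xsl:template match="/*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Assumption: search at any depth, not only direct children -->
      <xsl:for-each-group select=".//*[@bob:*]" group-by="f:AllBobConcat(.)">
        <xsl:copy-of select="current-group()[1]"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

For the sample document in the question, this should keep the first fred element and the Bill element. Note that the result is flat: xsl:copy-of copies each selected element together with its whole subtree, so with deeper nesting a descendant that carries bob:* attributes can appear both inside its ancestor's copy and as a group of its own. Whether that is acceptable depends on the output structure you actually want.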

Upvotes: 2

user663031

Reputation:

There is almost certainly a better way, but FWIW, here's a brute-force, alarmingly procedural, horribly inefficient solution:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    version="1.0"
    xmlns:bob="bob.com"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    >

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="element">
    <xsl:variable name="this" select="."/>
    <xsl:variable name="match">
      <xsl:for-each select="preceding::element">
        <xsl:variable name="diff">
          <xsl:variable name="that" select="."/>
          <xsl:for-each select="$this/@bob:*">
            <xsl:variable name="att-name" select="name()"/>
            <xsl:variable name="att-val" select="."/>
            <xsl:for-each select="$that/@bob:*[name()=$att-name]">
              <xsl:if test=". != $att-val">
                DIFF
              </xsl:if>
            </xsl:for-each>
          </xsl:for-each>
        </xsl:variable>
        <xsl:if test="$diff = ''">MATCH</xsl:if>
      </xsl:for-each>
    </xsl:variable>

    <xsl:if test="$match = ''">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

For each element, loop across all the preceding elements. For each attribute, check to see if the values are equal, and if not raise a DIFF flag. Any preceding element with no DIFF flags raised raises a MATCH flag. Then pass through an element only if no MATCH flags were raised.

Feels like I'm programming in assembly language. Now we'll sit back and wait for Michael to give us his one-liner using deep-equal or somesuch.

Upvotes: 1
