Dave

Reputation: 21

XSLT: remove all but the first occurrence of a given node

I have XML something like this:

<MyXml>  
  <RandomNode1>  
    <TheNode>  
      <a/>  
      <b/>  
      <c/>  
    </TheNode>  
  </RandomNode1>  
  <RandomNode2>  
  </RandomNode2>  
  <RandomNode3>  
    <RandomNode4>  
      <TheNode>  
        <a/>  
        <b/>  
        <c/>  
      </TheNode>    
    </RandomNode4>  
  </RandomNode3>  
</MyXml>

<TheNode> appears throughout the XML, but not always at the same level; it is often nested deep within other nodes. I need to remove every occurrence of <TheNode> except the first, since the rest are redundant and take up space. What XSLT would do this?

I have something like this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">  
  <xsl:output method="xml" indent="yes"/>  

  <xsl:template match="node() | @*">  
    <xsl:copy>  
      <xsl:apply-templates select="node() | @*" />  
    </xsl:copy>  
  </xsl:template>  

  <xsl:template match="//TheNode[position()!=1]">
  </xsl:template>

</xsl:stylesheet>

But that is not correct. Any suggestions?

Upvotes: 2

Views: 2974

Answers (3)

user357812

Reputation:

Another approach for the pattern would be:

<xsl:template match="TheNode[generate-id()
                             != generate-id(/descendant::TheNode[1])]"/>

Note: An absolute expression such as /descendant::TheNode[1] is more likely to be optimized by the processor than a relative expression such as preceding::TheNode.
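
For context, here is a minimal sketch of how that pattern could sit in a complete stylesheet. It assumes the same identity template used in the question; the comments are added for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy every node and attribute unchanged -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: drop every TheNode whose identity differs from
       that of the first TheNode in document order -->
  <xsl:template match="TheNode[generate-id()
                               != generate-id(/descendant::TheNode[1])]"/>
</xsl:stylesheet>
```

The second template has a predicate, so its default priority is higher than that of the node() pattern and it wins for every <TheNode> except the first.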

Upvotes: 0

Alohci

Reputation: 83116

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="TheNode[preceding::TheNode]"/>
</xsl:stylesheet>

Upvotes: 4

Tomalak

Reputation: 338376

//TheNode[position()!=1] does not work because here, position() is evaluated relative to the parent of each <TheNode>. It would select every <TheNode> that is not the first <TheNode> child of its respective parent.

But you were on the right track. What you meant was:

(//TheNode)[position()!=1]

Note the parentheses - they cause the predicate to be applied to the entire selected node-set, instead of to each node individually.

Unfortunately, even though this is a valid XPath expression, it is not valid as a match pattern. A match pattern must be meaningful (applicable) to an individual node; it cannot be an arbitrary select expression.

So @Alohci's solution,

//TheNode[preceding::TheNode]

is the correct way to express what you want.
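
To see the difference between the two predicate forms, both can be used in a select context, where either is legal. The following is only an illustrative sketch: the output element names (counts, per-parent, document-wide) are made up for this example and are not part of the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Hypothetical wrapper elements, used only to label the two results -->
    <counts>
      <!-- Predicate evaluated per parent: TheNode elements that are
           not the first TheNode child of their parent -->
      <per-parent><xsl:value-of select="count(//TheNode[position() != 1])"/></per-parent>
      <!-- Predicate evaluated on the whole node-set: every TheNode
           except the first one in document order -->
      <document-wide><xsl:value-of select="count((//TheNode)[position() != 1])"/></document-wide>
    </counts>
  </xsl:template>
</xsl:stylesheet>
```

Run against a well-formed version of the sample input, the first count should be 0, because each <TheNode> is the first such child of its parent, and the second should be 1.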

Upvotes: 2
