Gavin Sutherland

Reputation: 1686

How do I identify duplicate nodes in XPath 1.0 using an XPathNavigator to evaluate?

I am trying to identify duplicate serial numbers in the following XML using XPath 1.0, evaluating the expression in .NET with an XPathNavigator.

<?xml version="1.0" encoding="utf-16"?>
<Inventory xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <Items>
        <Item>
            <SerialNumber>1111</SerialNumber>
        </Item>
        <Item>
            <SerialNumber>1112</SerialNumber>
        </Item>
        <Item>
            <SerialNumber>1112</SerialNumber>
        </Item>
    </Items>
</Inventory>

I tried to do this by evaluating this

//Items/Item/SerialNumber

expression in a custom XSLT context function (implementing IXsltContextFunction, like this MSDN example) in .NET, but the Invoke function is called once per result, so I cannot see the other results to find duplicates.

1) Is there a way of doing this using a single XPath 1.0 expression?

OR

2) Is there a way of passing an array of elements into a single Invoke call of the custom XSLT context function class? I'm working in VB.NET but am happy with any C# examples anyone can share.

Thanks,

Gavin

Edit

Thanks to O. R. Mapper and Dimitre for their responses. I initially accepted O. R. Mapper's response since it did what I asked. I've since accepted Dimitre's answer because it provides a distinct list of values. Both responses were very helpful, though!

Upvotes: 2

Views: 3788

Answers (2)

Dimitre Novatchev

Reputation: 243549

Use:

/*/*/Item
      [SerialNumber = following-sibling::Item/SerialNumber
     and
       not(SerialNumber = preceding-sibling::Item/SerialNumber)
      ]

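This selects just one Item element (the first in document order) for each group of sibling Item elements whose SerialNumber children have the same string value. Item elements whose serial number occurs only once are not selected.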

XSLT-based verification:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

 <xsl:template match="/">
     <xsl:copy-of select=
      "/*/*/Item
          [SerialNumber = following-sibling::Item/SerialNumber
         and
           not(SerialNumber = preceding-sibling::Item/SerialNumber)
          ]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document (based on the provided one, but made more interesting):

<Inventory>
    <Items>
        <Item>
            <SerialNumber>1111</SerialNumber>
        </Item>
        <Item>
            <SerialNumber>2222</SerialNumber>
        </Item>
        <Item>
            <SerialNumber>2222</SerialNumber>
        </Item>
        <Item>
            <SerialNumber>2222</SerialNumber>
        </Item>
        <Item>
            <SerialNumber>1111</SerialNumber>
        </Item>
        <Item>
            <SerialNumber>1111</SerialNumber>
        </Item>
        <Item>
            <SerialNumber>3333</SerialNumber>
        </Item>
    </Items>
</Inventory>

the transformation evaluates the XPath expression and copies the selected nodes to the output:

<Item>
   <SerialNumber>1111</SerialNumber>
</Item>
<Item>
   <SerialNumber>2222</SerialNumber>
</Item>

Finally, if you want just the duplicated SerialNumber values, use:

   /*/*/Item
          [SerialNumber = following-sibling::Item/SerialNumber
         and
           not(SerialNumber = preceding-sibling::Item/SerialNumber)
          ]
           /SerialNumber/text()
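
Since the question asks about evaluating the expression with an XPathNavigator, here is a minimal C# sketch of how that might look. It is an illustration under assumptions, not part of the original answer: the class name is made up, the XML is an inline copy of the asker's sample with the XML declaration and namespace declarations dropped for brevity, and the document is loaded from a string rather than from a file.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

class DuplicateSerials // hypothetical name, for illustration only
{
    static void Main()
    {
        // Inline copy of the sample from the question (declaration omitted).
        const string xml = @"<Inventory>
  <Items>
    <Item><SerialNumber>1111</SerialNumber></Item>
    <Item><SerialNumber>1112</SerialNumber></Item>
    <Item><SerialNumber>1112</SerialNumber></Item>
  </Items>
</Inventory>";

        // The expression above: the first Item of each group of duplicates,
        // then the text of its SerialNumber child.
        const string expr =
            "/*/*/Item[SerialNumber = following-sibling::Item/SerialNumber" +
            " and not(SerialNumber = preceding-sibling::Item/SerialNumber)]" +
            "/SerialNumber/text()";

        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // Select returns an iterator over the matching nodes in document order.
        XPathNodeIterator it = nav.Select(expr);
        while (it.MoveNext())
        {
            Console.WriteLine(it.Current.Value); // prints 1112
        }
    }
}
```

Because the whole duplicate test lives inside the XPath expression, no custom context function is needed for this approach.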

Upvotes: 3

O. R. Mapper

Reputation: 20760

I'm going to answer 1), so 2) should not matter any more:

You can use the preceding-sibling axis on your <Item> elements to find any preceding <Item> elements with the same serial number.

Try this (written so that it returns only the serial numbers themselves rather than elements - if this is not quite what you want, and you don't know how to change the result, let me know):

/Inventory/Items/Item/SerialNumber/node()[.=../../preceding-sibling::Item/SerialNumber/node()]

For your sample document, it returns

1112
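
As a rough C# sketch of evaluating this expression through an XPathNavigator, assume a hypothetical document in which 2222 occurs three times (the class name and input are made up for illustration). With more than two occurrences the expression yields every repeated occurrence after the first, so the value comes back twice. Wrapping the expression in boolean() and calling Evaluate is one possible way to get a simple yes/no answer.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

class RepeatedSerials // hypothetical name, for illustration only
{
    static void Main()
    {
        // Hypothetical input: 2222 occurs three times, 1111 once.
        const string xml =
            "<Inventory><Items>" +
            "<Item><SerialNumber>1111</SerialNumber></Item>" +
            "<Item><SerialNumber>2222</SerialNumber></Item>" +
            "<Item><SerialNumber>2222</SerialNumber></Item>" +
            "<Item><SerialNumber>2222</SerialNumber></Item>" +
            "</Items></Inventory>";

        const string expr =
            "/Inventory/Items/Item/SerialNumber/node()" +
            "[.=../../preceding-sibling::Item/SerialNumber/node()]";

        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // Each text node that matches an earlier sibling's serial number.
        XPathNodeIterator it = nav.Select(expr);
        while (it.MoveNext())
        {
            Console.WriteLine(it.Current.Value); // prints 2222 twice
        }

        // boolean() turns the node-set into true when it is non-empty.
        bool hasDuplicates = (bool)nav.Evaluate("boolean(" + expr + ")");
        Console.WriteLine(hasDuplicates); // prints True
    }
}
```

If each duplicated value should appear only once, the results can be de-duplicated in the calling code, or the predicate can additionally require that no preceding sibling has the same value.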

Upvotes: 4
