Jeff Chew

Reputation: 167

Getting unique XPath node values in nested nodes

I'm a bit of a newbie to XPath, so I need some help figuring this out. I have an XML file like so:

<items>
    <item>
        <brandName>Brand 1</brandName>
        <productTypes>
            <productType>Type 1</productType>
            <productType>Type 3</productType>
        </productTypes>
    </item>
    <item>
        <brandName>Brand 1</brandName>
        <productTypes>
            <productType>Type 2</productType>
            <productType>Type 3</productType>
        </productTypes>
    </item>
    <item>
        <brandName>Brand 2</brandName>
        <productTypes>
            <productType>Type 4</productType>
            <productType>Type 5</productType>
        </productTypes>
    </item>
</items>

I'm trying to figure out a way to get all of the unique productType values for a specific brand. For example, the unique productType values for "Brand 1" would be "Type 1", "Type 2", "Type 3".

I've been googling without much luck. Any help would be appreciated!

Upvotes: 2

Views: 1105

Answers (1)

AlbertFerras

Reputation: 726

This works:

(/items/item[brandName='Brand 1']/productTypes/productType)[not(text()=preceding::*)]

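How it works: the parenthesized expression selects every productType node belonging to an item whose brandName is 'Brand 1'. The predicate then keeps only the nodes whose text does not equal the value of any element that precedes them in the document. Note that preceding::* looks at every earlier element, including those under other brands, so a type that first appears under a different brand would be dropped; narrowing the comparison to preceding::productType[../../brandName='Brand 1'] avoids that.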

Tested in Python:

import libxml2dom

# xml holds the XML document from the question as a string
n = libxml2dom.parseString(xml)
[x.textContent for x in n.xpath("(/items/item[brandName='Brand 1']/productTypes/productType)[not(text()=preceding::*)]")]
# result: [u'Type 1', u'Type 3', u'Type 2']
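
If libxml2dom is not available, the same result can be approximated with Python's standard library. This is only a sketch of a different technique, not the XPath above: xml.etree.ElementTree supports a limited XPath subset without the preceding axis, so the path selects the brand's productType nodes and the duplicates are removed in Python instead. The function name unique_product_types and the XML constant are made up for illustration, and the brand name is assumed not to contain a single quote.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question.
XML = """<items>
    <item>
        <brandName>Brand 1</brandName>
        <productTypes>
            <productType>Type 1</productType>
            <productType>Type 3</productType>
        </productTypes>
    </item>
    <item>
        <brandName>Brand 1</brandName>
        <productTypes>
            <productType>Type 2</productType>
            <productType>Type 3</productType>
        </productTypes>
    </item>
    <item>
        <brandName>Brand 2</brandName>
        <productTypes>
            <productType>Type 4</productType>
            <productType>Type 5</productType>
        </productTypes>
    </item>
</items>"""


def unique_product_types(xml_text, brand):
    """Return the distinct productType texts for one brand, in document order.

    Hypothetical helper: assumes `brand` contains no single quote, since it
    is placed directly inside the path predicate.
    """
    root = ET.fromstring(xml_text)  # root is the <items> element
    path = "./item[brandName='%s']/productTypes/productType" % brand
    seen = []
    for node in root.findall(path):  # findall yields nodes in document order
        if node.text not in seen:    # keep only the first occurrence
            seen.append(node.text)
    return seen


print(unique_product_types(XML, "Brand 1"))
# ['Type 1', 'Type 3', 'Type 2']
```

Because the de-duplication only looks at nodes already selected for the requested brand, values that appear under other brands do not affect the result.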

Upvotes: 3
