Reputation: 167
I'm a bit of a newbie to XPath, so I need some help figuring this out. I have an XML file like so:
<items>
<item>
<brandName>Brand 1</brandName>
<productTypes>
<productType>Type 1</productType>
<productType>Type 3</productType>
</productTypes>
</item>
<item>
<brandName>Brand 1</brandName>
<productTypes>
<productType>Type 2</productType>
<productType>Type 3</productType>
</productTypes>
</item>
<item>
<brandName>Brand 2</brandName>
<productTypes>
<productType>Type 4</productType>
<productType>Type 5</productType>
</productTypes>
</item>
</items>
I'm trying to figure out a way of getting all of the unique productType values for a specific brand. For example, the unique productType values for "Brand 1" would be "Type 1", "Type 2", and "Type 3".
I've been googling without much luck. Any help would be appreciated!
Upvotes: 2
Views: 1105
Reputation: 726
This works:
(/items/item[brandName='Brand 1']/productTypes/productType)[not(text()=preceding::*)]
How it works: the first (...) selects every productType element of the items with brandName='Brand 1'. At this point I have a list of productType nodes. The predicate then keeps only the nodes whose text does not equal the value of any element that precedes the current node in the document.
Tried in Python:
import libxml2dom
# xml is assumed to hold the XML document from the question as a string
n = libxml2dom.parseString(xml)
[x.textContent for x in n.xpath("(/items/item[brandName='Brand 1']/productTypes/productType)[not(text()=preceding::*)]")]
>>> [u'Type 1', u'Type 3', u'Type 2']
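One caveat worth noting: preceding::* looks at every earlier element in the document, not only those under 'Brand 1', so a type that appears earlier under a different brand would be dropped from the result. If that matters, a minimal sketch of a different approach is to select the brand's items and de-duplicate in Python instead of in XPath. This is not the method above; it uses the standard library's xml.etree.ElementTree, and the function name unique_product_types and the XML constant are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question.
XML = """
<items>
  <item>
    <brandName>Brand 1</brandName>
    <productTypes>
      <productType>Type 1</productType>
      <productType>Type 3</productType>
    </productTypes>
  </item>
  <item>
    <brandName>Brand 1</brandName>
    <productTypes>
      <productType>Type 2</productType>
      <productType>Type 3</productType>
    </productTypes>
  </item>
  <item>
    <brandName>Brand 2</brandName>
    <productTypes>
      <productType>Type 4</productType>
      <productType>Type 5</productType>
    </productTypes>
  </item>
</items>
"""


def unique_product_types(xml_text, brand):
    """Return the distinct productType texts for one brand, in document order.

    Hypothetical helper: duplicates are only checked against types already
    seen for the same brand, never against other brands.
    """
    root = ET.fromstring(xml_text)
    seen = []
    for item in root.findall("item"):
        # Compare the brand in Python so no quoting is needed in the path.
        if item.findtext("brandName") != brand:
            continue
        for product_type in item.findall("productTypes/productType"):
            if product_type.text not in seen:
                seen.append(product_type.text)
    return seen


print(unique_product_types(XML, "Brand 1"))
```

For the sample document this prints ['Type 1', 'Type 3', 'Type 2'], the same values and order as the XPath result above.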
Upvotes: 3