Reputation: 23
I have the following XML document:
<RootNode>
<SubNode name="MainNode" SubNodeID="1">
<SubSubNode SubSubID="10" SubSubName="Product Food">
<Item subItemID="100" ItemName="Apple" OtherName="Gala"/>
<Item subItemID="101" ItemName="Apple" OtherName="Aroma"/>
<Item subItemID="102" ItemName="Pear" OtherName="Williams"/>
<Item subItemID="103" ItemName="Pear" OtherName="Abate"/>
<Item subItemID="104" ItemName="Cranberry" OtherName="Bilberry"/>
<Item subItemID="105" ItemName="Cranberry" OtherName="Bluberries"/>
<Item subItemID="106" ItemName="Strawberry" OtherName="Berry"/>
<Item subItemID="107" ItemName="Peach" OtherName="Nectarina"/>
</SubSubNode>
<SubSubNode SubSubID="20" SubSubName="Product Beverage">
<Item subItemID="108" ItemName="Cola" OtherName="Coca cola"/>
<Item subItemID="109" ItemName="Cola" OtherName="Pepsi"/>
<Item subItemID="110" ItemName="Orange Juice" OtherName="Fanta"/>
<Item subItemID="111" ItemName="Soft drink" OtherName="Grape soda"/>
<Item subItemID="112" ItemName="Soft drink" OtherName="Orange soda"/>
<Item subItemID="113" ItemName="Soft drink" OtherName="Grape soda"/>
</SubSubNode>
</SubNode>
</RootNode>
I load it with the usual statements:
import xml.etree.ElementTree as ET
tree = ET.parse('Food.xml')
root = tree.getroot()
I can find specific items with a specific attribute like OtherName="Gala" using
xPath = "SubNode/SubSubNode/Item[@OtherName='Gala']"
print(len(root.findall(xPath)))
What if I want to search for a text in any attribute? In plain XPath I would write something like:
//*[@*[contains(., 'berry')]]
But when I try this in Python, I get "SyntaxError: invalid predicate":
search_text = "berry"
# XPath expression to match any element with any attribute containing search_text
xpath_expr = f".//*[@*[contains(., '{search_text}')]]"
root.findall(xpath_expr)  # raises SyntaxError: invalid predicate
Any ideas? Thank you for your help
Upvotes: 0
Views: 39
Reputation: 3476
As described in the comments, lxml is the better way, because ElementTree supports only a limited subset of XPath. An alternative solution without XPath:
import xml.etree.ElementTree as ET
xml_s = """<RootNode>
<SubNode name="MainNode" SubNodeID="1">
<SubSubNode SubSubID="10" SubSubName="Product Food">
<Item subItemID="100" ItemName="Apple" OtherName="Gala"/>
<Item subItemID="101" ItemName="Apple" OtherName="Aroma"/>
<Item subItemID="102" ItemName="Pear" OtherName="Williams"/>
<Item subItemID="103" ItemName="Pear" OtherName="Abate"/>
<Item subItemID="104" ItemName="Cranberry" OtherName="Bilberry"/>
<Item subItemID="105" ItemName="Cranberry" OtherName="Bluberries"/>
<Item subItemID="106" ItemName="Strawberry" OtherName="some text"/>
<Item subItemID="107" ItemName="Peach" OtherName="Nectarina"/>
</SubSubNode>
<SubSubNode SubSubID="20" SubSubName="Product Beverage">
<Item subItemID="108" ItemName="Cola" OtherName="Coca cola"/>
<Item subItemID="109" ItemName="Cola" OtherName="Pepsi"/>
<Item subItemID="110" ItemName="Orange Juice" OtherName="Fanta"/>
<Item subItemID="111" ItemName="Soft drink" OtherName="Grape soda"/>
<Item subItemID="112" ItemName="some text" OtherName="Orange soda"/>
<Item subItemID="113" ItemName="Soft drink" OtherName="Grape soda"/>
</SubSubNode>
</SubNode>
</RootNode>"""
root = ET.fromstring(xml_s)
element_list = []
for some_text in root.iter():
    if "some text" in some_text.attrib.values():
        # print(some_text.tag, some_text.attrib)
        element_list.append(some_text)

# Find the keys with "some text"
for elem in element_list:
    keys = [k for k, v in elem.attrib.items() if v == 'some text']
    print(keys)
Output:
['OtherName']
['ItemName']
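Note that the loop above only matches attribute values that are exactly equal to the search text. For the substring search from the question (the contains(., 'berry') case), the same idea might be adapted roughly as follows. This is only a sketch: the helper name find_by_attr_substring is made up for illustration, the XML is a shortened copy of the question's data, and the match is case-sensitive, so "Berry" does not count.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's document (assumption: same structure).
xml_s = """<RootNode>
  <SubNode name="MainNode" SubNodeID="1">
    <SubSubNode SubSubID="10" SubSubName="Product Food">
      <Item subItemID="100" ItemName="Apple" OtherName="Gala"/>
      <Item subItemID="104" ItemName="Cranberry" OtherName="Bilberry"/>
      <Item subItemID="105" ItemName="Cranberry" OtherName="Bluberries"/>
      <Item subItemID="106" ItemName="Strawberry" OtherName="Berry"/>
    </SubSubNode>
  </SubNode>
</RootNode>"""


def find_by_attr_substring(root, search_text):
    """Hypothetical helper: return (element, matching attribute names) for
    every element with at least one attribute value containing search_text.
    The comparison is case-sensitive."""
    matches = []
    for elem in root.iter():
        # Collect the names of all attributes whose value contains the text.
        keys = [k for k, v in elem.attrib.items() if search_text in v]
        if keys:
            matches.append((elem, keys))
    return matches


root = ET.fromstring(xml_s)
for elem, keys in find_by_attr_substring(root, "berry"):
    print(elem.get("subItemID"), keys)
```

For a case-insensitive search, one option would be to compare search_text.lower() against v.lower() inside the list comprehension.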
Upvotes: 0