I have an XSD file of the following format:
<?xml version="1.0" encoding="UTF-8"?><xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:type name="type1">
<xsd:example>
<xsd:description>This is the description of said type1 tag</xsd:description>
</xsd:example>
</xsd:type>
<xsd:type name="type2">
<xsd:example>
<xsd:description>This is the description of said type2 tag</xsd:description>
</xsd:example>
</xsd:type>
<xsd:type name="type3">
<xsd:example>
<xsd:description>This is the description of said type3 tag</xsd:description>
</xsd:example>
</xsd:type>
</xsd:schema>
and the following XML file:
<theRoot>
<type1>hi from type1</type1>
<theChild>
<type2>hi from type2</type2>
<type3>hi from type3</type3>
</theChild>
</theRoot>
I'd like to retrieve the text inside the xsd:description element that sits under the xsd:type element whose name attribute is "type1". In other words, I'd like to retrieve "This is the description of said type1 tag".
I have tried to do this with lxml in Python, in the following way:
from lxml import etree
XSDDoc = etree.parse(xsdFile)
root = XSDDoc.getroot()
result = root.findall(".//xsd:type/xsd:example/xsd:description[@name='type1']", root.nsmap)
I've used the same example and solution mentioned here. However, my attempt just returns an empty list instead of the description.
For reference, my Python version is: Python 2.7.10
EDIT: When I use an example provided in the answer by retrieving the XML structure from a string, the result is as expected. However, when I try to retrieve from a file, I get empty lists returned (or None).
I am doing the following:
The code loops over every node in a separate XML file and, for each node, looks up the matching description in the XSD file:
XMLDoc = etree.parse(open(xmlFile))
for Node in XMLDoc.xpath('//*'):
    nameVariable = os.path.basename(XMLDoc.getpath(Node))
    root = XSDDoc.getroot()
    description = XSDDoc.find(".//xsd:type[@name='{0}']/xsd:example/xsd:description".format(nameVariable), root.nsmap)
If I try to print out the result.text, I get:
AttributeError: 'NoneType' object has no attribute 'text'
Upvotes: 0
Reputation: 51042
The predicate ([@name='type1']) must be applied in the right place. The name attribute is on the xsd:type element, not on xsd:description. This should work:
result = root.findall(".//xsd:type[@name='type1']/xsd:example/xsd:description", root.nsmap)
# result is a list
for r in result:
    print(r.text)
In case you only want a single node, you can use find instead of findall. Complete example:
from lxml import etree
xsdFile = """
<root xmlns:xsd='http://whatever.com'>
<xsd:type name="type1">
<xsd:example>
<xsd:description>This is the description of said type1 tag</xsd:description>
</xsd:example>
</xsd:type>
</root>"""
root = etree.fromstring(xsdFile)
result = root.find(".//xsd:type[@name='type1']/xsd:example/xsd:description", root.nsmap)
print(result.text)
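Regarding the follow-up in the question's edit: assuming the files look like the ones shown, one likely cause of the AttributeError is that the loop over //* also visits theRoot and theChild, which have no xsd:type entry in the schema. find returns None for those, so reading .text fails. Below is a minimal sketch of a guarded lookup. The helper name describe_elements, the trimmed sample data, and the explicit NS prefix map (used here instead of root.nsmap) are my own assumptions for illustration, as is the fallback to the standard library's ElementTree when lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library if lxml is unavailable.
    import xml.etree.ElementTree as etree

# Explicit prefix map, so the lookup does not depend on root.nsmap.
NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# Trimmed copies of the question's files (type3 left out for brevity).
XSD_TEXT = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:type name="type1">
<xsd:example>
<xsd:description>This is the description of said type1 tag</xsd:description>
</xsd:example>
</xsd:type>
<xsd:type name="type2">
<xsd:example>
<xsd:description>This is the description of said type2 tag</xsd:description>
</xsd:example>
</xsd:type>
</xsd:schema>"""

XML_TEXT = """<theRoot>
<type1>hi from type1</type1>
<theChild>
<type2>hi from type2</type2>
</theChild>
</theRoot>"""


def describe_elements(xsd_root, xml_root):
    """Map each element name in the XML to its description text, or None.

    Hypothetical helper; assumes the XML elements carry no namespace,
    so node.tag is the plain element name.
    """
    descriptions = {}
    for node in xml_root.iter("*"):  # "*" visits elements only
        path = ".//xsd:type[@name='{0}']/xsd:example/xsd:description".format(node.tag)
        match = xsd_root.find(path, NS)
        # find returns None when the schema has no entry for this element.
        descriptions[node.tag] = match.text if match is not None else None
    return descriptions


descriptions = describe_elements(etree.fromstring(XSD_TEXT), etree.fromstring(XML_TEXT))
for name, text in sorted(descriptions.items()):
    print("{0}: {1}".format(name, text))
```

With this sample data, type1 and type2 map to their description text, while theRoot and theChild map to None instead of raising. To read from files as in the question, the two etree.fromstring(...) calls could be replaced with etree.parse(path).getroot().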
Upvotes: 1