M0rgenstern

Reputation: 411

Parsing XSD files does not work -> Cannot find any tags

I am currently trying to parse an XSD file in Python using the lxml library. For testing purposes I put together the following file:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.w3schools.com" elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
<xs:simpleType name="BaselineShiftValueType">
  <xs:annotation>
    <xs:documentation>The actual definition is
            baseline | sub | super | &lt;percentage&gt; | &lt;length&gt; | inherit 
            not sure that union can do this 
    </xs:documentation>
  </xs:annotation>
  <xs:restriction base="string"/>
 </xs:simpleType>
</xs:schema>

I then tried to get the children of the root (schema), which are xs:element and xs:simpleType. Iterating over the children of the root works fine:

root = self.XMLTree.getroot()
for child in root:
    print("{}: {}".format(child.tag, child.attrib))

This leads to the output:

{http://www.w3.org/2001/XMLSchema}element: {'name': 'note'}
{http://www.w3.org/2001/XMLSchema}simpleType: {'name': 'BaselineShiftValueType'}

But when I want to have only children of a certain type, it does not work:

root = self.XMLTree.getroot()
element = self.XMLTree.find("element")
print(str(element))

This gives me the following output:

None

Using findall, or writing ./element or .//element, does not change the result. I am quite sure I am missing something. What is the right way to do this?

Upvotes: 0

Views: 783

Answers (2)

Linkid

Reputation: 547

Following up on @helderdarocha's answer, you can also define the namespace once in a dictionary and pass it to the search functions, as shown in the Python xml.etree.ElementTree documentation. Note that the mapping must be a dictionary (prefix: URI) and that the selector still needs the prefix:

ns = {'xs': "http://www.w3.org/2001/XMLSchema"}
element = self.XMLTree.find("xs:element", ns)
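As a minimal sketch of the dictionary approach, the snippet below runs find and findall against a trimmed copy of the schema from the question. It uses the standard library's xml.etree.ElementTree so that it runs without extra packages; lxml.etree accepts the same namespaces mapping. The names XSD and NS are chosen here for illustration, and the xmlns:xs declaration is assumed to be present in the real file, since the parser output in the question shows that namespace.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the schema from the question, with the xs prefix declared.
XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:simpleType name="BaselineShiftValueType">
    <xs:restriction base="string"/>
  </xs:simpleType>
</xs:schema>
"""

# The prefix used here is local to the queries; only the URI has to match
# the namespace of the parsed elements.
NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

root = ET.fromstring(XSD)

# Direct children of the root only.
top_level = root.findall("xs:element", NS)
# Every xs:element at any depth below the root, in document order.
nested = root.findall(".//xs:element", NS)
# First xs:simpleType child of the root.
simple_type = root.find("xs:simpleType", NS)

top_names = [e.get("name") for e in top_level]
all_names = [e.get("name") for e in nested]
print(top_names)
print(all_names)
print(simple_type.get("name"))
```

Keeping the mapping in one place means every query can reuse it, and the prefix in the queries does not have to be the one used in the file.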

Upvotes: 0

helderdarocha

Reputation: 23637

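You are missing the namespace. Unprefixed names in a path expression are treated as belonging to no namespace, so "element" does not match {http://www.w3.org/2001/XMLSchema}element. Calling register_namespace will not help here: it is a module-level function of lxml.etree, not a method of the tree, and it does not affect how find resolves prefixes. Instead, map the prefix to the namespace URI in the find call itself and use a prefixed selector:

element = self.XMLTree.find("xs:element", namespaces={'xs': "http://www.w3.org/2001/XMLSchema"})

If you do not want to map a prefix, the fully qualified name in Clark notation works as well:

element = self.XMLTree.find("{http://www.w3.org/2001/XMLSchema}element")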
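To see why the unprefixed search returns None, the sketch below compares an unqualified lookup with one that uses the expanded name the parser actually stores in child.tag. It is written against the standard library's xml.etree.ElementTree so that it runs without extra packages; lxml.etree exposes the same find interface. The helper qname and the constant XS are hypothetical names introduced for illustration, and the two-element schema is a cut-down stand-in for the file in the question.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def qname(local_name, uri=XS):
    """Build a Clark-notation tag such as '{uri}element' (hypothetical helper)."""
    return "{%s}%s" % (uri, local_name)


root = ET.fromstring(
    '<xs:schema xmlns:xs="%s">'
    '<xs:element name="note"/>'
    '<xs:simpleType name="BaselineShiftValueType"/>'
    '</xs:schema>' % XS
)

# "element" names an element in no namespace, so nothing matches.
unqualified = root.find("element")
# The expanded name matches the xs:element child.
qualified = root.find(qname("element"))
# Children can also be filtered by comparing tags against the expanded name.
simple_types = [c for c in root if c.tag == qname("simpleType")]

print(unqualified)
print(qualified.get("name"))
print([c.get("name") for c in simple_types])
```

The expanded form is what appears in the question's output, which is why comparing against it, or searching with it, succeeds where the bare local name does not.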

Upvotes: 1
