Peter

Reputation: 23

xmlschema element is not an element of the schema

Decoding part of an XML document with xmlschema and an XPath expression fails when I select the item elements whose doc_id attribute has the value 2.

This is my XML file, simple.xml:

<?xml version="1.0" encoding="UTF-8"?>
<na:main
        xmlns:na="ames"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="ames ./simple.xsd">
    <na:item doc_id="1" ref_id="k1">content_k1</na:item>
    <na:item doc_id="2" ref_id="k2">content_k2</na:item>
</na:main>

And this is my XML schema file, simple.xsd:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:na="ames"
           targetNamespace="ames"
           elementFormDefault="qualified">

    <xs:complexType name="itemtype">
        <xs:simpleContent>
            <xs:extension base="xs:string">
                <xs:attribute name="doc_id" type="xs:int" />
                <xs:attribute name="ref_id" type="xs:string" />
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>

    <xs:complexType name="maintype">
        <xs:sequence>
            <xs:element name="item" maxOccurs="unbounded" type="na:itemtype" />
        </xs:sequence>
    </xs:complexType>

    <xs:element name="main" type="na:maintype" />

</xs:schema>

This is my Python code:

>>> import xmlschema
>>> xs = xmlschema.XMLSchema('simple.xsd')
>>> xs.is_valid('simple.xml')
True
>>> xs.to_dict('simple.xml', ".//na:item[@doc_id=1]")
{'@doc_id': 1, '@ref_id': 'k1', '$': 'content_k1'}
>>> xs.to_dict('simple.xml', ".//na:item[@doc_id=2]")
---------------------------------------------------------------------------
XMLSchemaValidationError                  Traceback (most recent call last)
<ipython-input-57-8ff81c2eaf9c> in <module>
----> 1 xmlschema.XMLSchema('simple.xsd').to_dict('simple.xml', ".//na:item[@doc_id=2]")

/rao/uhome/rmol3/bin/anaconda3_rmol3_lglxs408_2/envs/MW41_Quicklook/lib/python3.7/site-packages/xmlschema/validators/schema.py in decode(self, source, path, schema_path, validation, *args, **kwargs)
   1553         """
   1554         data, errors = [], []
-> 1555         for result in self.iter_decode(source, path, schema_path, validation, *args, **kwargs):
   1556             if not isinstance(result, XMLSchemaValidationError):
   1557                 data.append(result)

/rao/uhome/rmol3/bin/anaconda3_rmol3_lglxs408_2/envs/MW41_Quicklook/lib/python3.7/site-packages/xmlschema/validators/schema.py in iter_decode(self, source, path, schema_path, validation, process_namespaces, namespaces, use_defaults, decimal_type, datetime_types, converter, filler, fill_missing, max_depth, depth_filler, lazy_decode, **kwargs)
   1540                 else:
   1541                     reason = "{!r} is not an element of the schema".format(elem)
-> 1542                     yield schema.validation_error(validation, reason, elem, source, namespaces)
   1543                     return
   1544

/rao/uhome/rmol3/bin/anaconda3_rmol3_lglxs408_2/envs/MW41_Quicklook/lib/python3.7/site-packages/xmlschema/validators/xsdbase.py in validation_error(self, validation, error, obj, source, namespaces, **_kwargs)
    904
    905         if validation == 'strict' and error.elem is not None:
--> 906             raise error
    907         return error
    908

XMLSchemaValidationError: failed validating <Element '{ames}item' at 0x7eff7913db90> with XMLSchema10(basename='simple.xsd', namespace='ames'):

Reason: <Element '{ames}item' at 0x7eff7913db90> is not an element of the schema

Instance:

  <na:item xmlns:na="ames" doc_id="2" ref_id="k2">content_k2</na:item>

Path: /na:main/na:item[2]

What is wrong with the XPath statement ".//na:item[@doc_id=2]"?

Upvotes: 2

Views: 1568

Answers (4)

Peter

Reputation: 23

More examples that work:

>>> import xmlschema
>>> xs = xmlschema.XMLSchema('simple.xsd')

>>> xs.to_dict('simple.xml', ".//na:item[@doc_id='2']", schema_path='.//na:item')
{'@doc_id': 2, '@ref_id': 'k2', '$': 'content_k2'}

>>> xs.to_dict('simple.xml', ".//na:item[@doc_id=2]", schema_path='.//na:item')
{'@doc_id': 2, '@ref_id': 'k2', '$': 'content_k2'}

Upvotes: 0

Davide Brunato

Reputation: 752

The schema processor cannot find the matching schema element for decoding the data, because the provided path is not suitable for use on schema elements. You have to provide an explicit schema_path that points to the right XSD element:

>>> xs.to_dict("simple.xml", "/na:main/na:item[@doc_id=2]", schema_path="/na:main/na:item")
{'@doc_id': 2, '@ref_id': 'k2', '$': 'content_k2'}

Upvotes: 2

kimbert

Reputation: 2422

The XPath is not relevant - it can only be executed if the XML document can be parsed. But you are getting a Schema Validation error from the XML parser. It is claiming that the root tag in your document is not declared in your XSD. However, I have tested your XSD and XML in https://www.freeformatter.com/xml-validator-xsd.html and it validates OK.

Please check that the XML/XSD combination that you posted is the one that you tested with - that might explain the rather puzzling situation.

Upvotes: 0

Yitzhak Khabinsky

Reputation: 22321

You can try the following XPath:

   /na:main/na:item[@doc_id='2']
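
As a quick independent check that this XPath selects the intended element in the instance document, here is a minimal sketch that uses only Python's standard library (xml.etree.ElementTree, not xmlschema). It assumes the XML from the question is embedded as a bytes literal, and select_items is a hypothetical helper name. ElementTree does not accept absolute paths on an element, so the sketch uses the relative form .//na:item[@doc_id='2']; the attribute value has to be quoted because ElementTree's limited XPath only compares attributes against string literals.

```python
import xml.etree.ElementTree as ET

# The instance document from the question, embedded for a self-contained check.
SIMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<na:main
        xmlns:na="ames"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="ames ./simple.xsd">
    <na:item doc_id="1" ref_id="k1">content_k1</na:item>
    <na:item doc_id="2" ref_id="k2">content_k2</na:item>
</na:main>
"""

# Prefix-to-namespace mapping used to resolve "na:" in the path.
NS = {"na": "ames"}


def select_items(xml_bytes, doc_id):
    """Return all na:item elements whose doc_id attribute equals doc_id.

    Hypothetical helper for illustration only; it performs no schema
    validation and no type conversion (attribute values stay strings).
    """
    root = ET.fromstring(xml_bytes)
    return root.findall(f".//na:item[@doc_id='{doc_id}']", NS)


matches = select_items(SIMPLE_XML, 2)
for elem in matches:
    print(elem.tag, elem.attrib, elem.text)
```

If this prints the k2 item, the expression is fine for the instance document itself, which suggests the failure lies in locating the matching schema element rather than in the XPath, consistent with the schema_path answers.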


Upvotes: 0
