Bill Velasquez

Reputation: 893

Simplify XSD Schema using XQuery

We're building an XQuery tool to create documentation for XSD Schemas (specifically UBL 2.1 Schemas).

To do that, we need to simplify schemas that make extensive use of element references and named complex types, replacing those references with inline definitions.

So an element like this:

   <xsd:element name="Order" type="OrderType"/>
   <xsd:complexType name="OrderType">
      <xsd:sequence>
         ...
         <xsd:element ref="cbc:UBLVersionID" minOccurs="0" maxOccurs="1"/>
         ...
      </xsd:sequence>
   </xsd:complexType>
    ...
   (in another file)
   <xsd:element name="UBLVersionID" type="UBLVersionIDType"/>
   <xsd:complexType name="UBLVersionIDType">
      <xsd:simpleContent>
         <xsd:extension base="xsd:string"/> 
      </xsd:simpleContent>
   </xsd:complexType>

Should be converted to:

    <xsd:element name="Order">
       <xsd:complexType>
          <xsd:sequence>
             ...
             <xsd:element ref="cbc:UBLVersionID" minOccurs="0" maxOccurs="1">
                <xsd:complexType name="UBLVersionIDType">
                   <xsd:simpleContent>
                      <xsd:extension base="xsd:string"/>
                   </xsd:simpleContent>
                </xsd:complexType>
             </xsd:element>
             ...
          </xsd:sequence>
       </xsd:complexType>
    </xsd:element>

Note that some elements and types are defined in imported schemas.

Is there a known method to get this with XQuery?

Thanks.

Upvotes: 0

Views: 843

Answers (2)

Michael Kay

Reputation: 163352

You might find it useful to process the schemas into Saxon's SCM format, which is essentially an XML representation of the XSD schema components in normalized form. You can generate this form using

java com.saxonica.Validate -xsd:schema.xsd -scmout:schema.scm

The documentation for the SCM format is basically the schema component documentation in the W3C spec.

The format does the opposite of what you are asking for: all components are "out of line", accessed by following references. But it's highly uniform, and avoids all the complexities of managing includes, imports, namespaces, and QNames.

Upvotes: 1

adamretter

Reputation: 3517

I do not know of a "known method", by which I assume you mean something out of the box or pre-built.

You would effectively need to write a custom transformation, which you could do in either XQuery or XSLT. In XQuery, this is basically a recursive descent and you can find examples here: https://en.wikibooks.org/wiki/XQuery/Transformation_idioms
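
To make the recursive descent concrete, here is a minimal sketch under some assumptions: every schema document is already loaded into `$schemas`, and only `ref` and `type` on `xsd:element` are inlined. The function names `local:find` and `local:inline` and the `urn:example:*` namespaces are hypothetical, chosen for illustration. Because XSD does not allow `ref` together with an inline type, or a `name` on an inline type, the sketch emits `name` on the element and drops the type's `name`, which differs slightly from the output shown in the question.

```xquery
xquery version "3.0";

declare namespace xsd = "http://www.w3.org/2001/XMLSchema";

(: Hypothetical helper: find a global component of the given kind(s)
   by expanded name across all loaded schema documents. :)
declare function local:find(
  $schemas as element(xsd:schema)*,
  $name as xs:QName,
  $kinds as xs:string+
) as element()? {
  ($schemas[string(@targetNamespace) = string(namespace-uri-from-QName($name))]
     /xsd:*[local-name() = $kinds][@name = local-name-from-QName($name)])[1]
};

(: Recursive descent: copy every node, but replace element references and
   named types with inline definitions. $seen holds the types currently
   being expanded, so recursive types are left as plain references. :)
declare function local:inline(
  $node as node(),
  $schemas as element(xsd:schema)*,
  $seen as xs:QName*
) as node()* {
  typeswitch ($node)
    case element(xsd:element) return
      let $decl :=
        if ($node/@ref)
        then local:find($schemas, resolve-QName($node/@ref, $node), 'element')
        else $node
      let $type-name :=
        if ($decl/@type) then resolve-QName($decl/@type, $decl) else ()
      let $type :=
        if (empty($type-name) or $type-name = $seen) then ()
        else local:find($schemas, $type-name, ('complexType', 'simpleType'))
      return
        if (empty($decl)) then $node  (: unresolved reference: keep as is :)
        else
          element { node-name($node) } {
            $decl/(@* except (@type, @minOccurs, @maxOccurs)),
            $node/(@minOccurs, @maxOccurs),
            if (exists($type)) then
              element { node-name($type) } {
                $type/(@* except @name),
                $type/node() ! local:inline(., $schemas, ($seen, $type-name))
              }
            else (
              (: built-in, unknown, or recursive type: keep the reference :)
              $decl/@type,
              $decl/node() ! local:inline(., $schemas, $seen)
            )
          }
    case element() return
      element { node-name($node) } {
        $node/@*,
        $node/node() ! local:inline(., $schemas, $seen)
      }
    default return $node
};

(: Tiny stand-in schemas modelled on the question; real use would load
   the UBL documents with doc() instead. :)
let $schemas := (
  <xsd:schema targetNamespace="urn:example:order"
              xmlns="urn:example:order" xmlns:cbc="urn:example:cbc">
    <xsd:element name="Order" type="OrderType"/>
    <xsd:complexType name="OrderType">
      <xsd:sequence>
        <xsd:element ref="cbc:UBLVersionID" minOccurs="0" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:schema>,
  <xsd:schema targetNamespace="urn:example:cbc" xmlns="urn:example:cbc">
    <xsd:element name="UBLVersionID" type="UBLVersionIDType"/>
    <xsd:complexType name="UBLVersionIDType">
      <xsd:simpleContent>
        <xsd:extension base="xsd:string"/>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:schema>
)
return local:inline($schemas[1]/xsd:element[@name = 'Order'], $schemas, ())
```

This does not flatten `xsd:extension` or `xsd:restriction` base types, `xsd:group` or `xsd:attributeGroup` references, or attribute references. Rebuilt elements may also lose namespace declarations for prefixes that appear only inside attribute values, so treat the result as input for documentation rather than as a schema to validate against.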

However, as @michael-kay points out, you would also have to write code to process imports and includes. So pre-processing to the SCM format first (before you do your inlining) may be a good idea. Of course, you would then also need to write a transform to go from SCM back to XSD.
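
If you stay with plain XSD rather than SCM, the import and include handling could look roughly like the sketch below. It assumes every `xsd:import` and `xsd:include` you care about carries a `schemaLocation` that `doc()` can load; the function name `local:collect` and the external variable `$entry` are hypothetical.

```xquery
xquery version "3.0";

declare namespace xsd = "http://www.w3.org/2001/XMLSchema";

(: Assumption: the absolute URI of the entry schema document is
   supplied by the caller. :)
declare variable $entry as xs:string external;

(: Worklist traversal: returns the URIs of all schema documents reachable
   from $todo through xsd:import and xsd:include, visiting each URI once. :)
declare function local:collect(
  $todo as xs:string*,
  $seen as xs:string*
) as xs:string* {
  if (empty($todo)) then $seen
  else
    let $uri := head($todo)
    return
      if ($uri = $seen) then local:collect(tail($todo), $seen)
      else
        let $locations :=
          doc($uri)/xsd:schema/(xsd:import | xsd:include)/@schemaLocation
        (: schemaLocation is relative to the document that contains it :)
        let $next := $locations ! string(resolve-uri(., base-uri(..)))
        return local:collect((tail($todo), $next), ($seen, $uri))
};

(: All xsd:schema elements reachable from the entry document. :)
local:collect($entry, ()) ! doc(.)/xsd:schema
```

Imports without a `schemaLocation` are skipped, and `xsd:redefine` is not handled. The resulting sequence of `xsd:schema` elements is what an inlining transform would search when it resolves a `ref` or `type`.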

Upvotes: 0
