Sander_P

Reputation: 1835

How to do schema validation to get missing references with SaxonJS

If I have a simple xsd file and a simple xml file, can SaxonJS show which elements and which attributes in the xml are not defined in the xsd?

I have been looking around for examples but haven't been able to find anything so far.

Update

I'll also accept an answer with JS code (Node) that uses saxon-js to traverse an XML resource and check whether its elements and attributes are declared in an XSD resource (attribute values don't have to be checked), in a reasonably efficient way.

Upvotes: 1

Views: 684

Answers (1)

Martin Honnen

Reputation: 167571

To simply check whether the schema contains an xs:element declaration with a given name, a key suffices; the same goes for xs:attribute. But all of that relies on plain xs:element name="foo" and xs:attribute name="att1" declarations being used, and it in no way checks nesting or structure:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">

  <xsl:output method="xml" indent="yes"/>
  
  <xsl:key name="element-by-name" match="xs:element" use="QName(/*/@targetNamespace, @name)"/>
  
  <xsl:key name="attribute-by-name" match="xs:attribute" use="QName(/*/@targetNamespace, @name)"/>
  
  <xsl:template match="*[not(key('element-by-name', node-name(), $schema-doc))]">
    <element-not-declared>
      <name>{node-name()}</name>
      <path>{path()}</path>
    </element-not-declared>
    <xsl:next-match/>
  </xsl:template>
  
  <xsl:template match="@*[not(key('attribute-by-name', node-name(), $schema-doc))]">
    <attribute-not-declared>
      <name>{node-name()}</name>
      <path>{path()}</path>
    </attribute-not-declared>
  </xsl:template>

  <xsl:mode on-no-match="shallow-skip"/>

  <xsl:template match="/" name="xsl:initial-template">
    <xsl:next-match/>
    <xsl:comment xmlns:saxon="http://saxon.sf.net/">Run with {system-property('xsl:product-name')} {system-property('xsl:product-version')} {system-property('Q{http://saxon.sf.net/}platform')}</xsl:comment>
  </xsl:template>
  
  <xsl:param name="schema-doc">
    <xs:schema>
      <xs:element name="root">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="items">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="item" maxOccurs="unbounded">
                    <xs:complexType>
                      <xs:sequence>
                        <xs:element name="foo" type="xs:string"/>
                      </xs:sequence>
                    </xs:complexType>
                  </xs:element>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>
  </xsl:param>

</xsl:stylesheet>

Sample input

<?xml version="1.0" encoding="utf-8"?>
<root>
  <items count="1">
    <item>
      <foo>foo 1</foo>
      <bar>bar 1</bar>
    </item>
  </items>
</root>

when run with Saxon-JS 2.3 in the browser gives

    <attribute-not-declared>
       <name>count</name>
       <path>/Q{}root[1]/Q{}items[1]/@count</path>
    </attribute-not-declared>
    <element-not-declared>
       <name>bar</name>
       <path>/Q{}root[1]/Q{}items[1]/Q{}item[1]/Q{}bar[1]</path>
    </element-not-declared>
    <!--Run with Saxon-JS 2.3 Browser-->

but I have tested that it works with "Saxon-JS 2.3 Node.js" as well.

So this finds elements and attributes for which no declaration with a matching name exists anywhere in the schema, based on the keys defined over the schema document. It is only meant as a superficial approach: it does not take into account the complications of namespaces, such as elementFormDefault and attributeFormDefault.
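To illustrate what that simplification leaves out: the keys above always pair @name with the schema's targetNamespace, which is only right for global declarations and for local ones that are qualified. A rough sketch of the XSD rule in plain JavaScript follows; the function name and its option names are made up for illustration and are not part of SaxonJS.

```javascript
// Sketch: which namespace does a declared element or attribute name live in?
// All names here (declaredNamespace and its options) are hypothetical helpers.
function declaredNamespace({
  targetNamespace = "",          // xs:schema/@targetNamespace, "" if absent
  isGlobal,                      // true if the declaration is a child of xs:schema
  form,                          // the declaration's own @form, if present
  formDefault = "unqualified",   // elementFormDefault or attributeFormDefault
}) {
  // Global declarations are always in the target namespace.
  if (isGlobal) return targetNamespace;
  // Local declarations: an explicit @form wins over the schema-wide default.
  const effectiveForm = form ?? formDefault;
  return effectiveForm === "qualified" ? targetNamespace : "";
}

// A local element in a schema without elementFormDefault is in no namespace,
// even though the schema has a target namespace.
console.log(JSON.stringify(
  declaredNamespace({ targetNamespace: "urn:example", isGlobal: false })
)); // ""

// The same element with elementFormDefault="qualified".
console.log(JSON.stringify(
  declaredNamespace({ targetNamespace: "urn:example", isGlobal: false, formDefault: "qualified" })
)); // "urn:example"
```

A more careful version of the keys would compute the namespace along these lines instead of always using /*/@targetNamespace.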

To run XSLT 3 code with SaxonJS, you can either run the XSLT using SaxonJS.XPath.evaluate calling the XPath 3.1 transform function, as shown below, or you can first use the xslt3 command line tool to export the XSLT to SEF/JSON which can then be run using SaxonJS.transform.

const SaxonJS = require("saxon-js");

const xml = `<?xml version="1.0" encoding="utf-8"?>
<root>
  <items count="1">
    <item>
      <foo>foo 1</foo>
      <bar>bar 1</bar>
    </item>
  </items>
</root>`;

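// Note: despite its name, `xsd` holds the XSLT stylesheet; the schema itself
// is embedded in it as the default value of the schema-doc parameter.
const xsd = `<?xml version="1.0" encoding="utf-8"?>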
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">

  <xsl:output method="xml" indent="yes"/>
  
  <xsl:key name="element-by-name" match="xs:element" use="QName(/*/@targetNamespace, @name)"/>
  
  <xsl:key name="attribute-by-name" match="xs:attribute" use="QName(/*/@targetNamespace, @name)"/>
  
  <xsl:template match="*[not(key('element-by-name', node-name(), $schema-doc))]">
    <element-not-declared>
      <name>{node-name()}</name>
      <path>{path()}</path>
    </element-not-declared>
    <xsl:next-match/>
  </xsl:template>
  
  <xsl:template match="@*[not(key('attribute-by-name', node-name(), $schema-doc))]">
    <attribute-not-declared>
      <name>{node-name()}</name>
      <path>{path()}</path>
    </attribute-not-declared>
  </xsl:template>

  <xsl:mode on-no-match="shallow-skip"/>

  <xsl:template match="/" name="xsl:initial-template">
    <xsl:next-match/>
    <xsl:comment xmlns:saxon="http://saxon.sf.net/">Run with {system-property('xsl:product-name')} {system-property('xsl:product-version')} {system-property('Q{http://saxon.sf.net/}platform')}</xsl:comment>
  </xsl:template>
  
  <xsl:param name="schema-doc">
    <xs:schema>
      <xs:element name="root">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="items">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="item" maxOccurs="unbounded">
                    <xs:complexType>
                      <xs:sequence>
                        <xs:element name="foo" type="xs:string"/>
                      </xs:sequence>
                    </xs:complexType>
                  </xs:element>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>
  </xsl:param>

</xsl:stylesheet>`;


const result = SaxonJS.XPath.evaluate(`
  transform(
    map {
      'source-node' : parse-xml($xml),
      'stylesheet-text' : $xsd,
      'delivery-format' : 'serialized'
      }
  )?output`,
  [],
  {
    params : { xml : xml, xsd : xsd }
  }
);

console.log(result);

Output

<?xml version="1.0" encoding="UTF-8"?>
<attribute-not-declared>
   <name>count</name>
   <path>/Q{}root[1]/Q{}items[1]/@count</path>
</attribute-not-declared>
<element-not-declared>
   <name>bar</name>
   <path>/Q{}root[1]/Q{}items[1]/Q{}item[1]/Q{}bar[1]</path>
</element-not-declared>
<!--Run with Saxon-JS 2.3 Node.js-->
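For the second route mentioned above (export to SEF with xslt3, then run with SaxonJS.transform), a minimal sketch could look like the following. It assumes the stylesheet has been saved to a file and compiled beforehand; the file names and the two helper functions are placeholders I made up, not anything prescribed by SaxonJS.

```javascript
// Assumption: the stylesheet was saved as check-declarations.xsl (hypothetical
// name) and compiled once on the command line, for example with
//   npx xslt3 -xsl:check-declarations.xsl -export:check-declarations.sef.json -nogo

// Build the options object passed to SaxonJS.transform.
function buildTransformOptions(sefFileName, xmlText) {
  return {
    stylesheetFileName: sefFileName, // path to the exported SEF file (Node.js only)
    sourceText: xmlText,             // the XML to check, as a string
    destination: "serialized",       // return the result as a string
  };
}

// Run the compiled stylesheet and resolve to the serialized report.
async function reportUndeclared(sefFileName, xmlText) {
  // Required lazily so the helpers can be loaded before saxon-js is installed.
  const SaxonJS = require("saxon-js");
  const result = await SaxonJS.transform(
    buildTransformOptions(sefFileName, xmlText),
    "async"
  );
  return result.principalResult;
}
```

It would be called as reportUndeclared("check-declarations.sef.json", xml).then(console.log), with xml being the sample input string from the code above. Compiling once and reusing the SEF file avoids recompiling the XSLT on every run, which may matter if many documents are checked.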

Upvotes: 0
