neilsimp1

Reputation: 1259

XML Validation - If Then

I'm trying to find the best way of validating the below XML document. I've looked into three ways of doing this so far:

DTD - doesn't look like it has enough options to make this happen.

XSD - almost works, but I can't find a way of expressing something like If/Then.

XSL - about the same problem: I can write Ifs, but I can't figure out how to make them work against this document.

The idea is this: The root element is <Answers>. It contains 1 or more <Answer> elements. Each has an ID attribute marking the question number it corresponds to (handled elsewhere, not important). Each question may have 0 or more answers, which are currently written as <A1>, <A2>... Each answer contains 1 or more elements, <E1>, <E2>...

An example document looks like:

<Answers>

  <Answer ID="1a">
    <A1>
      <E1>Element 1</E1>
      <E2>Element 2</E2>
      <E3>Element 3</E3>
      <E4>Element 4</E4>
    </A1>
    <A2>
      <E1>Element 1</E1>
      <E2>Element 2</E2>
      <E3>Element 3</E3>
      <E4>Element 4</E4>
    </A2>
  </Answer>

  <Answer ID="1b">
    <A1>
      <E1>Element A</E1>
      <E2>Element B</E2>
      <E3>Element C</E3>
    </A1>
  </Answer>

  <Answer ID="2">
    <A1>
      <E1>Element 1</E1>
      <E2>Element 2</E2>
      <E3>Element 3</E3>
      <E4>Element 4</E4>
      <E5>Element 5</E5>
    </A1>
    <A2>
      <E1>Element 1</E1>
      <E2>Element 2</E2>
      <E3>Element 3</E3>
      <E4>Element 4</E4>
      <E5>Element 5</E5>
    </A2>
    <A3>
      <E1>Element 1</E1>
      <E2>Element 2</E2>
      <E3>Element 3</E3>
      <E4>Element 4</E4>
      <E5>Element 5</E5>
    </A3>

  </Answer>
</Answers>

The Elements can contain any string; the content is not important, as long as it is not empty.

What I need to validate is that, for each Answer (identified by its ID), every <A*> contains a specific number of elements. In this example, 1a requires 4 elements, 1b requires 3, and 2 requires 5.

The format is flexible, so if need be I can change things to something like: Element A Element B Element C or something along those lines.

I've tried all sorts of combinations, but I can't find a way of requiring a specific number of <E*> elements depending on the Answer ID.

Does anyone have any thoughts, or can they point me in the right direction, even if it's just saying "Yes, XSD can do this", or "No, XSL cannot do this"?

Upvotes: 2

Views: 712

Answers (1)

Mathias Müller

Reputation: 22617

The type of validation you'd like to do (an "if then" statement) is called an assertion. As pointed out by @kjhughes, XML Schema 1.1 supports assertions, so you could express such conditional relationships between parts of the XML document with XSD 1.1.
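
To give an idea of what the XSD 1.1 route could look like, here is a minimal sketch, not a tested schema. It assumes an XSD 1.1 capable processor, uses a lax wildcard for the A* children because A1, A2... are not declared individually, and only covers the ID "1a"; the other IDs would need analogous xs:assert elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="Answers">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Answer" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Assumption: accept any child elements here (A1, A2, ...)
                   and leave their structure to the assertion below -->
              <xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
            <xs:attribute name="ID" type="xs:string" use="required"/>
            <!-- "If ID is 1a, then every A* child has exactly 4 E* children",
                 written as "not 1a, or the count holds" -->
            <xs:assert test="not(@ID = '1a') or
                             (every $a in *[starts-with(local-name(), 'A')]
                              satisfies count($a/*[starts-with(local-name(), 'E')]) = 4)"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Note that an XSD 1.0 processor will reject xs:assert, so this only helps if your toolchain supports XSD 1.1.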

Alternatively, you could use an XSLT transformation. It does not really perform validation; rather, it transforms your input XML into a report of only those cases where your constraints are not satisfied. Still, XSLT is not a validation language.

I assume someone is eventually going to contribute an XSLT answer - so I would like to suggest something else: Schematron. The following Schematron rules are assertions that define exactly the constraints you need.

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

    <pattern>
        <title>Content of the answer element.</title>
        <rule context="Answer">
            <assert test="*[starts-with(name(),'A')]" xml:lang="en">The <name/> element must contain at least 1 A* element.</assert>
        </rule>
    </pattern>

    <pattern>
        <title>Count number of child elements E* of elements A*</title>
        <rule context="Answer[@ID = '1a']/*[starts-with(name(),'A')]">
            <assert test="count(*[starts-with(name(),'E')]) = 4" xml:lang="en">If the @ID attribute of the Answer element is "1a",
                then all child elements A* must have exactly 4 child elements E*.</assert>
        </rule>
        <rule context="Answer[@ID = '1b']/*[starts-with(name(),'A')]">
            <assert test="count(*[starts-with(name(),'E')]) = 3" xml:lang="en">If the @ID attribute of the Answer element is "1b",
                then all child elements A* must have exactly 3 child elements E*.</assert>
        </rule>
        <rule context="Answer[@ID = '2']/*[starts-with(name(),'A')]">
            <assert test="count(*[starts-with(name(),'E')]) = 5" xml:lang="en">If the @ID attribute of the Answer element is "2",
                then all child elements A* must have exactly 5 child elements E*.</assert>
        </rule>
    </pattern>

</schema>

If the input document contains the following structure:

<Answer ID="1a">
    <A1>
        <E1>Element 1</E1>
        <E2>Element 2</E2>
        <E4>Element 4</E4>
    </A1>
</Answer>

the validating application would complain, saying that

E [ISO Schematron] If the @ID attribute of the Answer element is "1a", then all child elements A* must have exactly 4 child elements E*.

Schematron files can be interpreted with the Schematron Reference Implementation - an XSLT stylesheet - or environments like Oxygen.


As an aside: it would not be too difficult to translate the Schematron rules above to XSLT, because the two are very similar. Note that an assert fires when its test is false, so the XSLT test has to be negated. You'd have to replace the following, for instance:

SCH                                 | XSLT equivalent
-----------------------------------------------------------------------------
<sch:rule context="...">            | <xsl:template match="...">
<sch:assert test="*">               | <xsl:if test="not(*)">
<sch:report test="*">               | <xsl:if test="*">
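
As a rough illustration of that mapping, the "1a" rule might be translated along these lines. This is only a sketch under the assumption that a plain-text report is wanted; the message wording and the text output method are my own choices, and the rules for "1b" and "2" would need their own templates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Suppress text nodes that the built-in templates would otherwise copy -->
  <xsl:template match="text()"/>

  <!-- Corresponds to <sch:rule context="..."> -->
  <xsl:template match="Answer[@ID = '1a']/*[starts-with(name(), 'A')]">
    <!-- Corresponds to <sch:assert>: the test is negated -->
    <xsl:if test="not(count(*[starts-with(name(), 'E')]) = 4)">
      <xsl:text>Answer 1a: </xsl:text>
      <xsl:value-of select="name()"/>
      <xsl:text> must have exactly 4 child elements E*.&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a valid document, this should produce empty output; any line it prints points at an A* element that violates the constraint.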

Upvotes: 3
