Reputation: 615
I have defined a unique constraint on multiple elements, as described in "define unique constraint based on multiple elements".
The unique constraint now looks like this:
<xs:unique name="specieSizeGroupLengthAssortment">
<xs:selector xpath="DataRow"/>
<xs:field xpath="Specie"/>
<xs:field xpath="Group"/>
<xs:field xpath="Length"/>
<xs:field xpath="Type"/>
</xs:unique>
Now imagine the element "Type" is optional. So far, my searching and testing indicate that this unique constraint only applies to elements that contain all the subelements listed in the constraint. For example:
This should be invalid due to unique constraint:
<DataRow>
<Specie>A</Specie>
<Length>100</Length>
<Group>A</Group>
</DataRow>
<DataRow>
<Specie>A</Specie>
<Length>100</Length>
<Group>A</Group>
</DataRow>
This should be valid :
<DataRow>
<Specie>A</Specie>
<Length>100</Length>
<Group>A</Group>
</DataRow>
<DataRow>
<Specie>A</Specie>
<Length>100</Length>
<Group>A</Group>
<Type>D</Type>
</DataRow>
This should be invalid :
<DataRow>
<Specie>A</Specie>
<Length>100</Length>
<Group>A</Group>
<Type>D</Type>
</DataRow>
<DataRow>
<Specie>A</Specie>
<Length>100</Length>
<Group>A</Group>
<Type>D</Type>
</DataRow>
Is it possible to create an XSD schema that will do this kind of validation?
Upvotes: 3
Views: 3521
Reputation: 163468
I think I got it the wrong way around. Unique constraints allow absent fields; Key constraints do not.
The language is very obscure, but is easier to understand in the XSD 1.1 version because some notes have been added. I don't think there is any (intentional) change in functionality between the two versions.
The {selector}, with the element information item as the context node, evaluates to a node-set (as defined in [XPath]). [Definition:] Call this the target node set.
Call the subset of the ·target node set· for which all the {fields} evaluate to a node-set with exactly one member which is an element or attribute node with a simple type the qualified node set.
So if some selected node has a missing value for one of its fields, then this node is not part of the qualified node-set.
This means that for "unique", selected nodes for which a field is absent are simply ignored.
This means that for "key", the data is invalid if one of the fields is missing.
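To make the contrast concrete, here is a sketch of the same constraint declared as a key instead. The constraint name is made up for illustration; the selector and fields are copied from the question. With a declaration like this, every selected DataRow has to supply all four fields, so any row without Type would be reported as invalid, which is presumably not what the question wants.

```xml
<!-- Sketch only: "dataRowKey" is a hypothetical name. -->
<!-- Unlike xs:unique, xs:key requires every selected node to have all fields. -->
<xs:key name="dataRowKey">
  <xs:selector xpath="DataRow"/>
  <xs:field xpath="Specie"/>
  <xs:field xpath="Group"/>
  <xs:field xpath="Length"/>
  <xs:field xpath="Type"/>
</xs:key>
```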
I'm left concluding that the original schema as posted almost does what is required, except that the first example is not invalid: for both selected nodes there is a field missing, therefore neither selected node is included in the qualified node-set, therefore the unique constraint has no effect. To make this invalid, you will need a second "unique" constraint that only lists the first three fields. But then you will get a validity error if these three fields are the same, even if the fourth field is present.
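As a rough sketch, the second constraint could look like the following, declared alongside the original one on the same parent element. The name is an assumption chosen for illustration.

```xml
<!-- Sketch only: "specieSizeGroupLength" is a hypothetical name. -->
<!-- Catches duplicates among rows regardless of whether Type is present. -->
<xs:unique name="specieSizeGroupLength">
  <xs:selector xpath="DataRow"/>
  <xs:field xpath="Specie"/>
  <xs:field xpath="Group"/>
  <xs:field xpath="Length"/>
</xs:unique>
```

With both constraints in place, the first and third examples from the question are rejected, but so is the second one, because its two rows agree on Specie, Group and Length even though only one of them has a Type.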
In XSD 1.1 of course you can solve the problem with an assertion, along the lines of:
test="count(DataRow) = count(distinct-values(DataRow/concat(
Specie, '|', Length, '|', Group, '|', Type)))"
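For context, this is roughly where such an assertion would sit in an XSD 1.1 schema. The parent element name "Data" and the type name "DataRowType" are assumptions, since the question does not show them, and the sketch assumes the schema has no target namespace.

```xml
<!-- Sketch only: "Data" and "DataRowType" are hypothetical names. -->
<xs:element name="Data">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="DataRow" type="DataRowType" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- xs:assert comes after the content model and any attributes. -->
    <!-- An absent Type contributes an empty string to concat(). -->
    <xs:assert test="count(DataRow) = count(distinct-values(DataRow/concat(
        Specie, '|', Length, '|', Group, '|', Type)))"/>
  </xs:complexType>
</xs:element>
```

Two caveats with this approach: a row with an empty Type element would be treated the same as a row with no Type at all, and values that themselves contain the '|' separator could produce false matches, so pick a separator that cannot occur in the data.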
Upvotes: 2
Reputation: 122394
The specification states that each field in a unique constraint must identify a single node (element or attribute) whose content or value, which must be of a simple type, is used in the constraint.
XML Schema part 1: Structures, §3.11.1, my bold.
So it appears that you can't use optional elements in a uniqueness constraint. This is backed up by the step-by-step rules for validating these constraints (§3.11.4):
3 For each node in the ·target node set· all of the {fields}, with that node as the context node, evaluate to either an empty node-set or a node-set with exactly one member, which must have a simple type. [Definition:] Call the sequence of the type-determined values (as defined in [XML Schemas: Datatypes]) of the [schema normalized value] of the element and/or attribute information items in those node-sets in order the key-sequence of the node.
4 [Definition:] Call the subset of the ·target node set· for which all the {fields} evaluate to a node-set with exactly one member which is an element or attribute node with a simple type the qualified node set. The appropriate case among the following must be true:
4.1 If the {identity-constraint category} is unique, then no two members of the ·qualified node set· have ·key-sequences· whose members are pairwise equal, as defined by Equal in [XML Schemas: Datatypes].
[...]
This explicitly defines the uniqueness check as applying only to the "qualified node set", i.e. those nodes matching the selector which have values for all their fields.
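Applied to the constraint from the question, the rule works out as in the annotated instance below. The wrapper element "Data" is an assumption, since the question does not show the parent of DataRow.

```xml
<!-- Sketch only: "Data" is a hypothetical parent element. -->
<Data>
  <!-- No Type: the Type field evaluates to an empty node-set, so this row
       is not in the qualified node set and the unique check skips it. -->
  <DataRow>
    <Specie>A</Specie>
    <Length>100</Length>
    <Group>A</Group>
  </DataRow>
  <!-- Same values, also no Type: skipped as well, so no violation is reported. -->
  <DataRow>
    <Specie>A</Specie>
    <Length>100</Length>
    <Group>A</Group>
  </DataRow>
  <!-- All four fields present: this row is in the qualified node set. -->
  <DataRow>
    <Specie>A</Specie>
    <Length>100</Length>
    <Group>A</Group>
    <Type>D</Type>
  </DataRow>
  <!-- Same key-sequence as the previous row: this pair violates the constraint. -->
  <DataRow>
    <Specie>A</Specie>
    <Length>100</Length>
    <Group>A</Group>
    <Type>D</Type>
  </DataRow>
</Data>
```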
Upvotes: 3