ROBERT RICHARDSON

Reputation: 2299

Why does a space break validation here?

In my XML schema, I have created a type named NonEmptyString. It is supposed to reject any value that is null or consists of nothing but whitespace. I've turned that around to say that it should accept anything that has at least one non-whitespace character. That should include anything with whitespace between two non-whitespace characters. However, it is rejecting "BATCH ANNEAL" while accepting "BATCH_ANNEAL".

In case it matters, I'm going to be using this schema in a Python 3 script, although this XML validator rejects it as well.

Here is the XML Schema definition:

<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="NonEmptyString">
    <xs:restriction base="xs:string">
      <xs:pattern value="\S+" />
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="MESSAGE">
    <xs:complexType>
      <xs:sequence>
        <xs:element type="xs:short" name="MESSAGE_NUMBER"/>
        <xs:element type="NonEmptyString" name="MESSAGE_TYPE"/>
        <xs:element type="NonEmptyString" name="PLANT_CODE"/>
        <xs:element type="NonEmptyString" name="PLANT_TEXT"/>
        <xs:element type="xs:dateTime" name="TIMESTAMP"/>
        <xs:element type="NonEmptyString" name="SIMULATION_INDEX"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Here is the element I'm trying to validate against it.

<MESSAGE>
    <MESSAGE_NUMBER>2601</MESSAGE_NUMBER>
    <MESSAGE_TYPE>MaterialData</MESSAGE_TYPE>
    <PLANT_CODE>ANBA</PLANT_CODE>
    <PLANT_TEXT>BATCH ANNEAL</PLANT_TEXT>
    <TIMESTAMP>2016-03-01T08:54:53</TIMESTAMP>
    <SIMULATION_INDEX>N</SIMULATION_INDEX>
</MESSAGE>

Upvotes: 1

Views: 188

Answers (2)

kjhughes

Reputation: 111726

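Here's an alternative to @Tomalak's (fine, +1) regex-based solution. This approach uses the xs:minLength and xs:whiteSpace facets instead of an xs:pattern regex. Whitespace is collapsed before the length is checked, so a value consisting only of whitespace collapses to the empty string and fails xs:minLength. Note that, unlike the pattern, this also accepts values with leading or trailing whitespace, since collapsing strips it: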

<xs:simpleType name="NonEmptyString">
  <xs:restriction base="xs:string">
    <xs:minLength value="1"/>
    <xs:whiteSpace value="collapse"/>
  </xs:restriction>
</xs:simpleType>
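
Since the question mentions Python 3, here is a rough sketch of what these two facets do to a value. It is a pure-Python emulation and not a schema validator; the helper names `collapse` and `is_valid_non_empty` are made up for illustration, and the whitespace set is assumed to be the four characters XSD treats as whitespace (space, tab, line feed, carriage return).

```python
import re

def collapse(value: str) -> str:
    """Approximate xs:whiteSpace="collapse" (hypothetical helper).

    Runs of space, tab, LF and CR become a single space, then leading
    and trailing spaces are removed.
    """
    return re.sub(r"[ \t\n\r]+", " ", value).strip(" ")

def is_valid_non_empty(value: str, min_length: int = 1) -> bool:
    """Approximate xs:minLength applied to the collapsed value."""
    return len(collapse(value)) >= min_length

for value in ("BATCH ANNEAL", "  BATCH   ANNEAL ", "   ", ""):
    print(repr(value), "->", repr(collapse(value)), is_valid_non_empty(value))
```

The whitespace-only and empty values are the only ones rejected; a real validator should still be the final word on what the schema accepts.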

Upvotes: 1

Tomalak

Reputation: 338386

An xs:pattern must always match the entire value; it is implicitly anchored at both ends. \S+ matches "BATCH_ANNEAL", but it does not match "BATCH ANNEAL" because \S cannot match the space in the middle.
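
The anchoring can be seen outside a schema validator with a small Python sketch. It uses `re.fullmatch` as a stand-in for XSD's whole-value matching. That is only an approximation: Python's `\S` is Unicode-aware, while XSD's `\S` excludes just space, tab, line feed and carriage return, though the two agree for the values in the question. The helper name `matches_whole_value` is made up for illustration.

```python
import re

# The pattern from the question. XSD anchors it implicitly, so
# re.fullmatch (not re.search) is the closest standard-library equivalent.
ORIGINAL_PATTERN = r"\S+"

def matches_whole_value(pattern: str, value: str) -> bool:
    """Return True only if the pattern matches the entire value."""
    return re.fullmatch(pattern, value) is not None

for value in ("BATCH_ANNEAL", "BATCH ANNEAL"):
    print(repr(value), matches_whole_value(ORIGINAL_PATTERN, value))

# An unanchored search still finds "BATCH" inside "BATCH ANNEAL", which is
# why the pattern can look correct when tried in an ordinary regex tester.
print(re.search(ORIGINAL_PATTERN, "BATCH ANNEAL").group())
```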

Try

<xs:pattern value="\S+|\S.*\S" />

This enforces values that either consist entirely of non-whitespace characters or begin and end with a non-whitespace character. In XSD regexes, . matches any character except line feed and carriage return. Use something more specific than . if necessary.
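
As a quick sanity check of the suggested pattern, here is a minimal Python sketch that runs it against a few sample values. It again relies on `re.fullmatch` to imitate XSD's implicit anchoring, so it approximates the schema behaviour and is no replacement for validating against the schema itself. The function name `accepted_by_pattern` and the sample values other than the two from the question are assumptions for illustration.

```python
import re

# The suggested replacement pattern, matched against the whole value.
FIXED_PATTERN = r"\S+|\S.*\S"

def accepted_by_pattern(value: str) -> bool:
    """Return True if the whole value matches the suggested pattern."""
    return re.fullmatch(FIXED_PATTERN, value) is not None

samples = ["BATCH ANNEAL", "BATCH_ANNEAL", "N", "", "   ", " BATCH", "BATCH "]
for sample in samples:
    print(repr(sample), accepted_by_pattern(sample))
```

Values with leading or trailing whitespace are rejected here, which is the main behavioural difference from an approach that collapses whitespace before checking the length.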

Upvotes: 2
