Reputation: 6500
I've explored various types of XML validation with XSD by using XmlDocument.Validate(ValidationEventHandler), XDocument.Validate(schemas, ValidationEventHandler), and XmlReader with a schema passed to it that sends results to a ValidationEventHandler callback.
However, the callback provides practically only the severity and an error string. I am receiving messages like:
The 'name' attribute is invalid - The value '' is invalid according to
its datatype 'TNonEmptyStringNoWhitespacesAtBeginningAndEnd' - The
Pattern constraint failed.
This is far from an ideal error message. The callback args do not say which parent caused it, which XML line it is on, or anything else practical.
In my scenario not all names are of the type given above; some of them can simply be empty strings (as they are optional).
With probably hundreds of XML nodes that have names, it is very annoying to locate the issue above, as there is no context information about the location, not even which XML node it is.
How can the verbosity of such a validation be extended? Notepad++, for instance, uses the XML Tools plugin, which outputs the message above as:
Validation of current file using XML schema:
ERROR: Element 'LightSource', attribute 'name': [facet 'minLength'] The value '' has a length of '0'; this underruns the allowed minimum length of '1'.
ERROR: Element 'LightSource', attribute 'name': [facet 'pattern'] The value '' is not accepted by the pattern '.*\S'.
ERROR: Element 'LightSource', attribute 'name': '' is not a valid value of the atomic type 'TNonEmptyStringNoWhitespacesAtBeginningAndEnd'.
This is more verbose and gives at least some context information: that the issue appears on a LightSource element, and what exactly went wrong with the underlying type.
Are there other facilities allowing a proper C# XSD validation with increased context information?
The validations were done on in-memory representations of the XML (XDocument and XmlDocument) as well as on XML read from a file with XmlReader. Obviously line numbers etc. only make sense where an XML file has already been written, but other information such as the parent element would be handy, so I could at least output the XML context to look at.
For the sake of completeness some code:
// using System; using System.Xml.Linq; using System.Xml.Schema;

// Load the schema and validate the document against it.
var schemas = new XmlSchemaSet();
schemas.Add("", xsdPath);
var doc = XDocument.Load(xmlFile);
doc.Validate(schemas, ValidationEventHandler);

public void ValidationEventHandler(object sender, ValidationEventArgs e)
{
    // Not much in e
    switch (e.Severity)
    {
        case XmlSeverityType.Error:
            Console.WriteLine("Error: {0}", e.Message);
            break;
        case XmlSeverityType.Warning:
            Console.WriteLine("Warning: {0}", e.Message);
            break;
    }
}
Another attempt that looked promising was http://msdn.microsoft.com/en-us/library/as3tta56%28v=vs.110%29.aspx but it did not increase the verbosity at all.
I have some type which forms a constraint:
<xs:simpleType name="TNonEmptyStringNoWhitespacesAtBeginn">
  <xs:restriction base="xs:string">
    <xs:pattern value="\S.*" />
    <xs:minLength value="1"/>
  </xs:restriction>
</xs:simpleType>
<xs:simpleType name="TNonEmptyStringNoWhitespacesAtBeginningAndEnd">
  <xs:restriction base="TNonEmptyStringNoWhitespacesAtBeginn">
    <xs:pattern value=".*\S" />
    <xs:minLength value="1"/>
  </xs:restriction>
</xs:simpleType>
Ignore TNonEmptyStringNoWhitespacesAtBeginn; it is a helper that allows AND-ing restrictions. When I have an attribute name with the type above that is just an empty string, I get very different amounts of information from C#'s XSD validation and from the Notepad++ XML Tools plugin.
Here are the different messages for the sake of completeness again:
C#
The 'name' attribute is invalid - The value '' is invalid according to
its datatype 'TNonEmptyStringNoWhitespacesAtBeginningAndEnd' - The
Pattern constraint failed.
Notepad++
ERROR: Element 'LightSource', attribute 'name': [facet 'minLength'] The value '' has a length of '0'; this underruns the allowed minimum length of '1'.
ERROR: Element 'LightSource', attribute 'name': [facet 'pattern'] The value '' is not accepted by the pattern '.*\S'.
ERROR: Element 'LightSource', attribute 'name': '' is not a valid value of the atomic type 'TNonEmptyStringNoWhitespacesAtBeginningAndEnd'.
With the information provided by the exception contents I can retrieve the XML element and display it, but saying that the constraint for TNonEmptyStringNoWhitespacesAtBeginningAndEnd failed is much less expressive than telling me in detail which part failed. I know I get the hint that the pattern constraint failed, but anybody who gets such a message needs to locate the type and inspect its constraints to understand what was violated. Judging by the data in the exception, this seems to be the available level of detail.
The XML Tools plugin seems able to expose each validation item, and in much more detail. This is not something merely inferred from the XSD; it looks like information obtained while processing each constraint.
What I hoped for was a way to increase the verbosity of the validator to get more information.
Upvotes: 3
Views: 2990
Reputation: 21658
Re: line numbers... For XDocument, if you enable the line info capture
XDocument xdoc = XDocument.Load(reader, LoadOptions.PreserveWhitespace | LoadOptions.SetLineInfo | LoadOptions.SetBaseUri);
then your validation handler can extract it, something like this in your posted ValidationEventHandler code (via IXmlLineInfo):
IXmlLineInfo node = sender as IXmlLineInfo;
if (node != null && node.HasLineInfo()) ...
This should cover the info you wanted...
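As a minimal sketch of the handler above: it assumes that XDocument.Validate passes the offending XAttribute or XElement as the sender, which is what makes the parent element and line info reachable. The inlined schema and document, the type name TNonEmpty, and the names ValidationContextSketch, Run, and Describe are hypothetical and only there to make the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

public static class ValidationContextSketch
{
    // Hypothetical schema: a Scene with LightSource children whose 'name' must end in non-whitespace.
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:simpleType name='TNonEmpty'>
    <xs:restriction base='xs:string'><xs:pattern value='.*\S'/></xs:restriction>
  </xs:simpleType>
  <xs:element name='Scene'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='LightSource' maxOccurs='unbounded'>
          <xs:complexType>
            <xs:attribute name='name' type='TNonEmpty'/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Hypothetical document: the second LightSource has an invalid (empty) name.
    const string Xml = @"<Scene>
  <LightSource name='ok' />
  <LightSource name='' />
</Scene>";

    // Validates the sample document and returns one description per validation event.
    public static List<string> Run()
    {
        var schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(Xsd))); // null: use the schema's own namespace

        // SetLineInfo makes the loaded nodes remember their line and position.
        var doc = XDocument.Load(new StringReader(Xml), LoadOptions.SetLineInfo);

        var messages = new List<string>();
        doc.Validate(schemas, (sender, e) => messages.Add(Describe(sender, e)));
        return messages;
    }

    // Builds a message with element, attribute and line context.
    // Assumption: sender is the XAttribute or XElement that failed validation.
    public static string Describe(object sender, ValidationEventArgs e)
    {
        var attribute = sender as XAttribute;
        var element = attribute != null ? attribute.Parent : sender as XElement;
        var lineInfo = sender as IXmlLineInfo;

        string where = element != null
            ? "Element '" + element.Name.LocalName + "'"
            : "(unknown element)";
        if (attribute != null)
            where += ", attribute '" + attribute.Name.LocalName + "'";
        if (lineInfo != null && lineInfo.HasLineInfo())
            where += " (line " + lineInfo.LineNumber + ", position " + lineInfo.LinePosition + ")";

        return e.Severity + ": " + where + ": " + e.Message;
    }

    public static void Main()
    {
        foreach (var message in Run())
            Console.WriteLine(message);
    }
}
```

The facet detail still comes from e.Message unchanged; the sketch only adds the location around it.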
For the traditional DOM, you have the option to inspect the Exception property (which gives you LineNumber and LinePosition); in theory at least, through the Exception property you could also get the SchemaObjectProperty. In all my code I am using XDocument, and that works fine for sure.
This should get you started to at least provide a better location in terms of line/position (which would work even if it is in memory).
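For the XmlReader route mentioned in the question, a rough sketch of reading LineNumber and LinePosition from the Exception property could look like this. The class name ReaderValidationSketch, the Run method, and the inlined sample schema and document are hypothetical; only the XmlReaderSettings and XmlSchemaException members are framework API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ReaderValidationSketch
{
    // Validates 'xml' against 'xsd' while reading and returns one line per validation event.
    public static List<string> Run(string xsd, string xml)
    {
        var problems = new List<string>();

        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));

        // e.Exception is an XmlSchemaException carrying the position in the source text.
        settings.ValidationEventHandler += (sender, e) =>
            problems.Add(string.Format(
                "{0} at line {1}, position {2}: {3}",
                e.Severity, e.Exception.LineNumber, e.Exception.LinePosition, e.Message));

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Validation happens as a side effect of reading through the document.
            while (reader.Read()) { }
        }
        return problems;
    }

    public static void Main()
    {
        // Hypothetical schema and document, for illustration only.
        const string xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Scene'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='LightSource' maxOccurs='unbounded'>
          <xs:complexType>
            <xs:attribute name='name'>
              <xs:simpleType>
                <xs:restriction base='xs:string'><xs:pattern value='.*\S'/></xs:restriction>
              </xs:simpleType>
            </xs:attribute>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";
        const string xml = "<Scene>\n  <LightSource name='' />\n</Scene>";

        foreach (var problem in Run(xsd, xml))
            Console.WriteLine(problem);
    }
}
```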
(Updates based on modified question)
C# won't give you what you see with the plugin you're referring to... to me it looks like an implementation choice. XSD facets work in conjunction; therefore, if any one of them fails, the whole value is deemed invalid.
.NET's built-in XSD validator is general purpose, without many validation tweaks (the only one is whether or not to check Unique Particle Attribution). To balance performance, simple type validations behave as described above.
The plugin seems to be designed for interactivity... it seems to want to tell as much as it can, no matter what it takes...
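Since the built-in validator only names the failing datatype, one possible workaround (not something the validator does for you) is to look that type up in the compiled XmlSchemaSet and print its declared facets next to the message. The sketch below assumes a schema without a target namespace, as in the question; FacetLookupSketch, LoadSchemas, and DescribeFacets are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class FacetLookupSketch
{
    // Loads and compiles a schema given as a string.
    public static XmlSchemaSet LoadSchemas(string xsd)
    {
        var schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        schemas.Compile();
        return schemas;
    }

    // Lists the facets declared on a named simple type and on the simple types it restricts.
    // Assumption: the type lives in no namespace.
    public static List<string> DescribeFacets(XmlSchemaSet schemas, string typeName)
    {
        var lines = new List<string>();
        var type = schemas.GlobalTypes[new XmlQualifiedName(typeName)] as XmlSchemaSimpleType;

        while (type != null)
        {
            var restriction = type.Content as XmlSchemaSimpleTypeRestriction;
            if (restriction != null)
            {
                foreach (XmlSchemaObject item in restriction.Facets)
                {
                    var facet = item as XmlSchemaFacet;
                    if (facet != null)
                        lines.Add(string.Format("{0}: {1} = '{2}'",
                            type.Name, facet.GetType().Name, facet.Value));
                }
            }
            // Walk up the restriction chain; stops once the base is no longer a simple type.
            type = type.BaseXmlSchemaType as XmlSchemaSimpleType;
        }
        return lines;
    }

    public static void Main()
    {
        // The two types from the question, wrapped in a schema element.
        const string xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:simpleType name='TNonEmptyStringNoWhitespacesAtBeginn'>
    <xs:restriction base='xs:string'>
      <xs:pattern value='\S.*'/>
      <xs:minLength value='1'/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name='TNonEmptyStringNoWhitespacesAtBeginningAndEnd'>
    <xs:restriction base='TNonEmptyStringNoWhitespacesAtBeginn'>
      <xs:pattern value='.*\S'/>
      <xs:minLength value='1'/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>";

        var schemas = LoadSchemas(xsd);
        foreach (var line in DescribeFacets(schemas, "TNonEmptyStringNoWhitespacesAtBeginningAndEnd"))
            Console.WriteLine(line);
    }
}
```

This only reports what the type requires; it does not say which individual facet rejected the value, which is the part the plugin adds.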
Upvotes: 4