Reputation: 12880
I have a slightly malformed XML file that I'm trying to parse in .NET. Other parsers consume this same file without complaint; that is, they're more tolerant of user error.
The XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<kml>
<Document id="12345">
<name>My name</name>
<description>My Description</description>
<myns:author>
<myns:name>My Name</myns:name>
</myns:author>
</Document>
</kml>
I load it like this:
XmlDocument doc = new XmlDocument();
doc.Load(myFilePath);
This second line rightfully throws an exception:
'myns' is an undeclared prefix. Line 6, position 4.
From the application's point of view, we act mostly as a conduit to another application that is able to deal with this slightly wrong XML file. We do not want to reject XML that this third-party application is capable of processing.
Is there a way to disable or modify the strictness of the .NET XML Parser?
Upvotes: 1
Views: 328
Reputation: 163342
All the previous answers, surprisingly, are wrong.
Your document is well-formed XML, but it is not namespace-well-formed XML: it conforms to the XML recommendation but not to the Namespaces in XML recommendation. You will therefore be able to parse it if you can find a parser that allows namespace processing to be switched off. I don't know if the Microsoft XML parser has such an option, but I don't see one here:
http://msdn.microsoft.com/en-US/library/9khb6435(v=vs.80).aspx
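One candidate worth checking is the older System.Xml.XmlTextReader class, which exposes a Namespaces property that can be set to false before reading. The following is a minimal sketch under that assumption, not a verified fix for every input; the class name LenientLoader and the method LoadWithoutNamespaces are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class LenientLoader
{
    // Hypothetical helper: parses XML with namespace processing switched off,
    // so a name such as "myns:author" is treated as a plain name that happens
    // to contain a colon, and no prefix lookup is attempted.
    public static XmlDocument LoadWithoutNamespaces(TextReader input)
    {
        var reader = new XmlTextReader(input);
        reader.Namespaces = false; // has to be set before any read operation

        var doc = new XmlDocument();
        doc.Load(reader);
        return doc;
    }
}
```

To use it with a file, something like LenientLoader.LoadWithoutNamespaces(File.OpenText(myFilePath)) should do. Bear in mind that the resulting document carries no namespace information, so any later namespace-aware processing (XPath with prefixes, for example) may not behave as it would on a properly declared document.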
Upvotes: 2
Reputation: 1500903
Is there a way to disable or modify the strictness of the .NET XML Parser?
Schema validation and things like that are somewhat optional, but this is simply invalid XML. XML parsers generally are this strict, and should be. The fact that the downstream application is capable of handling this is a worrying sign, in itself, IMO.
Options:
If the problem is just the myns namespace prefix being undeclared, you could fix that by declaring it in the root element. You'd probably want to load the file line by line, changing only the second one (the start tag of the root element).
Upvotes: 6
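The idea of declaring the myns prefix on the root element could be sketched roughly as follows. This version patches the whole text with a regular expression rather than going line by line, and it assumes the root element is always kml. The namespace URI urn:example:myns is a placeholder (the real one is not known from the file), and PrefixPatcher and LoadWithDeclaredPrefix are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

public static class PrefixPatcher
{
    // Placeholder URI: the actual namespace intended for "myns" is an assumption.
    public const string AssumedNamespace = "urn:example:myns";

    // Hypothetical helper: adds an xmlns:myns declaration to the first <kml
    // start tag, then parses the patched text with the normal strict parser.
    public static XmlDocument LoadWithDeclaredPrefix(string xmlText)
    {
        // Match "<kml" only when followed by whitespace or ">", so that a
        // longer element name starting with "kml" is left alone.
        var rootTag = new Regex(@"<kml(?=[\s>])");
        string patched = rootTag.Replace(
            xmlText,
            "<kml xmlns:myns=\"" + AssumedNamespace + "\"",
            1); // patch the first occurrence only

        var doc = new XmlDocument();
        doc.LoadXml(patched);
        return doc;
    }
}
```

A call such as PrefixPatcher.LoadWithDeclaredPrefix(File.ReadAllText(myFilePath)) would then load the file. Unlike switching namespace processing off, this changes the document's content, which matters if the XML is passed on to the downstream application afterwards.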
Reputation: 43168
A conformant XML processor (including the .NET API) does not distinguish between degrees of well-formedness, however "slight." Input is either well-formed, or it's not.
Depending on what you want to do with the document, you have different options for handling it, but all will involve some kind of modification, or System.Xml and company will be of no use here.
Upvotes: 2