Reputation: 1785
I am using XmlDocument.Load to load the contents of an XML file that has some characters in Thai. The application is erroring out with the following exception.
System.Xml.XmlException: Invalid character in the given encoding. Line 2, position 82.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.InvalidCharRecovery(Int32& bytesCount, Int32& charsCount)
   at System.Xml.XmlTextReaderImpl.GetChars(Int32 maxCharsCount)
   at System.Xml.XmlTextReaderImpl.ReadData()
   at System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars)
   at System.Xml.XmlTextReaderImpl.FinishPartialValue()
   at System.Xml.XmlTextReaderImpl.get_Value()
   at System.Xml.XmlLoader.LoadNode(Boolean skipOverWhitespace)
   at System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc)
   at System.Xml.XmlDocument.Load(XmlReader reader)
The XML file begins with this content
Notice the strange character before the closing tag. This content comes from a third party, and I don't have access to the file/content.
My questions are:
Upvotes: 2
Views: 4701
Reputation: 12811
If you are sure that they are Thai characters, try passing the correct encoding when you load the document.
A common single-byte character encoding for Thai is ISO 8859-11 (closely related to TIS-620 and windows-874).
So could you please try loading the document like this:
// The StreamReader decodes the file's bytes with the encoding given here.
using (var reader = new StreamReader("YourXMLFile.xml", Encoding.GetEncoding("iso-8859-11")))
    xmlDoc.Load(reader);
As for your first question, you may need to talk to the third party and ask them to check their source code to find out why those unwanted characters appear in the generated XML.
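If you are not sure whether the file is really in a Thai code page, one way to check before hard-coding an encoding is to decode the bytes with a strict UTF-8 decoder and fall back only when that fails. The following is a minimal sketch, not a definitive implementation: the class and method names (EncodingCheck, IsValidUtf8, LoadXmlFile) are made up for illustration, and it assumes the file is either UTF-8 or a Thai single-byte code page (windows-874 is used as the fallback here).

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class EncodingCheck
{
    // Returns true if the bytes decode cleanly as UTF-8.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // throwOnInvalidBytes: true makes the decoder throw instead of
        // silently substituting U+FFFD for bad byte sequences.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            strictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    // Hypothetical helper: loads the file as UTF-8 when it is valid UTF-8,
    // otherwise decodes it as windows-874 (an assumption about the source).
    public static XmlDocument LoadXmlFile(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        var doc = new XmlDocument();
        if (IsValidUtf8(bytes))
        {
            using (var stream = new MemoryStream(bytes))
                doc.Load(stream);
        }
        else
        {
            // On .NET Core / .NET 5+, code page encodings must be registered first:
            // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
            Encoding thai = Encoding.GetEncoding("windows-874");
            using (var reader = new StreamReader(new MemoryStream(bytes), thai))
                doc.Load(reader);
        }
        return doc;
    }

    public static void Main(string[] args)
    {
        string path = args.Length > 0 ? args[0] : "YourXMLFile.xml";
        XmlDocument doc = LoadXmlFile(path);
        Console.WriteLine(doc.DocumentElement.Name);
    }
}
```

If the fallback branch still throws, the offending bytes are probably not Thai text at all, and the question goes back to the third party.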
Upvotes: 2
Reputation: 6864
The data supplied by the third party is not valid XML. I think there are only two solutions: get the third party to supply valid XML, or strip the invalid characters from the XML and process what you can. You could do something like this:
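string invalidXML = File.ReadAllText(path);
// Keep only the characters that are legal in XML (requires System.Linq and System.Xml).
string validXml = new string(invalidXML.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray());
if (validXml != invalidXML)
{
    // log the invalid
}
// process (what you can in) the validXml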
Upvotes: 0