erman8

Reputation: 2241

Unable to open/read XML file containing accented characters

I have XML attributes that contain accented characters:

 <TestCase Name="Canadian Addresses - Test Case" Description="Canadian Addresses - Test Case" OnOff="True" NegativeTest="False" RollbackDB="False" Performance="False" PerformanceSummary="False" TimesToExecute="1">

 <ProviderFacilitySearch_FindProviderFacility ProviderInfo="Dr Marc-André Kärcher Samuels Senior|10 Château du Feÿ Ave, North Building, North Sydney, NS  B2A 3L7 CANADA" />
 <ProviderFacilitySearch_ViewProviderFacility ProviderInfo="Dr Marc-André Kärcher Samuels Senior|10 Château du Feÿ Ave, North Building, North Sydney, NS  B2A 3L7 CANADA" />
 <ViewProvider_LocationName ExpectedLocationName="Kärcher Health Care" />
 <ViewProvider_ServicingAddress ExpectedServicingAddress="10 Château du Feÿ Ave|Central Building|North Sydney, NS  B2A 3L7|CANADA" />
 <ViewProvider_ExpandMailingAddress NA="" />
 <ViewProvider_MailingAddress ExpectedMailingAddress="10 Château du Feÿ Ave|Central Building|North Sydney, NS  B2A 3L7|CANADA" />
 <ViewProvider_ExpandBillingAddress NA="" />
 <ViewProvider_BillingAddress ExpectedBillingAddress="10 Château du Feÿ Ave|Central Building|North Sydney, NS  B2A 3L7|CANADA" />
 <ViewProvider_Close NA="" />
 <ProviderFacilitySearch_Cancel NA="" />
 <UserLogout/>
 </TestCase>

When I try to read this XML file using C# code, I'm getting:

5/8/2013 2:39:03 PM ERROR: System.Xml.XmlException: Invalid character in the given encoding. Line 86, position 74. at System.Xml.XmlTextReaderImpl.Throw(Exception e) at System.Xml.XmlTextReaderImpl.Throw(String res, String arg) at System.Xml.XmlTextReaderImpl.InvalidCharRecovery(Int32& bytesCount, Int32& charsCount)

I can't even open the page using IE.

Is there a way to get this to work?

Upvotes: 4

Views: 6768

Answers (2)

Logar314159

Reputation: 503

Use this header:

<?xml version='1.0' encoding='ISO-8859-1'?>

Edit

The encoding declaration identifies which encoding is used to represent the characters in the XML document. Although XML parsers can determine automatically whether a document uses the UTF-8 or UTF-16 Unicode encoding, this declaration should be used in documents that are stored in any other encoding. The declared encoding has to match the bytes actually saved in the file.
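As a rough illustration of why the declaration matters, here is a minimal sketch in Python (the question uses C#, but the parser behaviour shown is the same idea). The one-element document and the `Provider` element name are made up for the example, and it assumes the file was saved as ISO-8859-1:

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document stored as ISO-8859-1 bytes,
# so the accented e (U+00E9) is the single byte 0xE9.
body = '<Provider Name="Marc-Andr\u00e9" />'.encode("iso-8859-1")

# Without a declaration the parser assumes UTF-8 and rejects the lone 0xE9 byte.
try:
    ET.fromstring(body)
    parsed_without_declaration = True
except ET.ParseError:
    parsed_without_declaration = False

# With a declaration that matches the stored bytes, the same content parses.
declared = b"<?xml version='1.0' encoding='ISO-8859-1'?>\n" + body
name = ET.fromstring(declared).get("Name")

print(parsed_without_declaration)
print(name)
```

If the file turns out to be saved in some other encoding (for example Windows-1252), the declaration would need to name that encoding instead.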

Upvotes: 5

Sphinxxx

Reputation: 13017

To see whether the file is actually encoded the way its header says it is, use a hex editor (e.g. HxD) to inspect the stored bytes.

If the file is UTF-8, you should see something similar to this:

(...)  ProviderInfo="Dr Marc-André Kärcher Samuels Senior|10 Château du Feÿ Ave (...)
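If no hex editor is at hand, the same check can be sketched in a few lines of Python. `describe_encoding` is a hypothetical helper written for this example, not a library function; it only reports whether the bytes are valid UTF-8 and does not guess which other encoding was used:

```python
def describe_encoding(raw: bytes) -> str:
    """Report whether raw bytes are valid UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return "valid UTF-8"
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that is not valid UTF-8.
        return f"not UTF-8: byte 0x{raw[err.start]:02X} at offset {err.start}"

# The same text ("Andr" followed by U+00E9) stored in two different encodings.
utf8_bytes = "Andr\u00e9".encode("utf-8")
latin1_bytes = "Andr\u00e9".encode("iso-8859-1")

print(utf8_bytes.hex(" "))    # the accented e is stored as the two bytes c3 a9
print(latin1_bytes.hex(" "))  # the accented e is stored as the single byte e9
print(describe_encoding(utf8_bytes))
print(describe_encoding(latin1_bytes))
```

To check a real file, pass its raw contents, e.g. `describe_encoding(open(path, "rb").read())`, where `path` points at the XML file.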

Upvotes: 0
