Alejandro Montilla

Reputation: 2654

How do I decode a base64 encoded string containing an XML document that contains characters with accents (á,é,í,ó,ú) in C#?

How do I decode a base64-encoded string containing an XML document that includes accented Latin letters (á,é,í,ó,ú)?

I am aware of the question "How do I encode and decode a base64 string?", but the solutions provided there do not work well with accented letters.

So far I've tried:

xmlBase64 = System.Text.Encoding.ASCII.GetString(System.Convert.FromBase64String(XmlDoc));
xmlBase64 = System.Text.Encoding.Unicode.GetString(System.Convert.FromBase64String(XmlDoc));
xmlBase64 = System.Text.Encoding.UTF8.GetString(System.Convert.FromBase64String(XmlDoc));
xmlBase64 = System.Text.Encoding.UTF32.GetString(System.Convert.FromBase64String(XmlDoc));

But in all cases the accented letters (Spanish characters) are replaced with ? or similar placeholder characters.

EDIT:

This is the base64 encoded string

This is the Decoded string

Upvotes: 3

Views: 4085

Answers (1)

DPenner1

Reputation: 10462

It's helpful to see the bytes produced by System.Convert.FromBase64String(XmlDoc).

I did that and looked at the word "metálicas" in your original string (it was simply the first accented word I found). That portion of the string decodes to the byte array 6D 65 74 E1 6C 69 63 61 73.
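To reproduce that inspection, here is a minimal sketch that prints the decoded bytes as hex. The class and method names (Base64ByteDump, ToHex) are made up for illustration, and the sample input is only the Base64 form of the nine bytes quoted above, not the full string from the question.

```csharp
using System;

static class Base64ByteDump
{
    // Decode a Base64 string and render the raw bytes as space-separated hex pairs.
    public static string ToHex(string base64)
    {
        byte[] raw = Convert.FromBase64String(base64);
        // BitConverter.ToString yields "6D-65-74-..."; swap the dashes for spaces.
        return BitConverter.ToString(raw).Replace("-", " ");
    }

    static void Main()
    {
        // Illustrative input: the Base64 encoding of the nine bytes for "metálicas".
        Console.WriteLine(ToHex("bWV04WxpY2Fz")); // prints 6D 65 74 E1 6C 69 63 61 73
    }
}
```

Running the same helper on the full XmlDoc value shows every byte, which makes it easier to spot how the accented characters are stored.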

From that byte array it's easy to see two things:

  • This is a single-byte encoding: nine bytes for a nine-letter word, with á stored as the single byte E1.
  • It is not UTF-8: in UTF-8, bytes greater than 7F never occur on their own; they always appear in sequences of 2 to 4 bytes.
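The UTF-8 observation can also be checked programmatically. The sketch below uses a UTF8Encoding instance configured to throw on invalid input instead of silently substituting replacement characters; the helper name IsValidUtf8 is hypothetical, and the byte array is the one quoted in this answer.

```csharp
using System;
using System.Text;

static class Utf8Check
{
    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    public static bool IsValidUtf8(byte[] data)
    {
        // Second argument (throwOnInvalidBytes: true) makes decoding strict.
        var strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    static void Main()
    {
        // Bytes for "metálicas" as found in the decoded data.
        byte[] bytes = { 0x6D, 0x65, 0x74, 0xE1, 0x6C, 0x69, 0x63, 0x61, 0x73 };
        // E1 is a UTF-8 lead byte, but 6C is not a continuation byte, so this prints False.
        Console.WriteLine(IsValidUtf8(bytes));
    }
}
```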

From there I guessed it would be some form of extended ASCII, and Windows-1252 seems to work. Try the following:

xmlBase64 = System.Text.Encoding.GetEncoding(1252).GetString(System.Convert.FromBase64String(XmlDoc));
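One caveat, offered as a hedged sketch rather than part of the original fix: on .NET Core and .NET 5+, code page 1252 is not available until the code pages provider is registered, so GetEncoding(1252) can throw there. The example assumes such a runtime (on .NET Framework the registration line is not needed and requires the System.Text.Encoding.CodePages package to compile). The class name Windows1252Decoder is hypothetical, and the input is just the nine-byte sample for "metálicas".

```csharp
using System;
using System.Text;

static class Windows1252Decoder
{
    // Decode Base64 to bytes, then interpret those bytes as Windows-1252 text.
    public static string Decode(string base64)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before GetEncoding(1252) can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        byte[] raw = Convert.FromBase64String(base64);
        return Encoding.GetEncoding(1252).GetString(raw);
    }

    static void Main()
    {
        // Ensure the console can display the accented character.
        Console.OutputEncoding = Encoding.UTF8;
        // Sample input only; decodes to the word "metálicas".
        Console.WriteLine(Decode("bWV04WxpY2Fz"));
    }
}
```

If the XML declaration in the decoded document names an encoding, that declared encoding is worth checking against this guess.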

Upvotes: 5
