Paul Meems

Reputation: 3284

How to set the encoding when streaming a blob from Azure Storage

I have an XML file in my blob storage. It contains words like this: Družstevní. When I download the XML using the Azure portal, the word is still correct.
But when I try using DownloadToStreamAsync the result is Dru�stevn�.

How do I fix this?

I found that DownloadTextAsync works because I can set the encoding: Encoding.GetEncoding(1252).
But then I end up with a string, and the rest of my code expects a stream. Should I read the string back into a stream, or is there a more elegant option?

Here's my code:

public Task<string> DownloadAsTextAsync(string code, Encoding encoding)
{
    var blockBlob = _container.GetBlockBlobReference(code);
    var blobRequestOptions = new BlobRequestOptions
    {
        MaximumExecutionTime = TimeSpan.FromMinutes(15),
        ServerTimeout = TimeSpan.FromHours(1)
    };

    // Use the caller's encoding; fall back to Windows-1252 when none is given.
    return blockBlob.DownloadTextAsync(encoding ?? Encoding.GetEncoding(1252), null, blobRequestOptions, null);
}

public async Task<Stream> DownloadAsStreamAsync(string code)
{
    var blockBlob = _container.GetBlockBlobReference(code);
    var blobRequestOptions = new BlobRequestOptions
    {
        MaximumExecutionTime = TimeSpan.FromMinutes(15),
        ServerTimeout = TimeSpan.FromHours(1)
    };
    var output = new MemoryStream();
    await blockBlob.DownloadToStreamAsync(output, null, blobRequestOptions, null);
    // Rewind so callers start reading at the beginning of the downloaded bytes.
    output.Position = 0;
    return output;
}

Edit, after Zhaoxing Lu's comment:
I changed my unit test to pass the encoding to the StreamReader, and now the test passes:

using (var streamReader = new StreamReader(stream, Encoding.GetEncoding(1252)))
{
    string line;
    while ((line = streamReader.ReadLine()) != null)
    {
        if (!line.StartsWith("            <Str>Dru")) continue;

        Debug.WriteLine(line);
        var street = line.Trim().Replace("<Str>", "").Replace("</Str>", "");
        Assert.AreEqual("Družstevní", street);
    }
}

But in my 'real' code I pass the stream on to be loaded as XML:

fileStream.Position = 0;
var xmlDocument = XDocument.Load(fileStream);

The text in the resulting xmlDocument is decoded with the wrong encoding, and I can't find out how to set it.

Upvotes: 0

Views: 1759

Answers (1)

Joey Cai

Reputation: 20067

The problem seems to occur when the stream is read as an XDocument.

You can wrap the stream in a StreamReader created with Encoding.GetEncoding("Windows-1252") and load the XDocument from that reader:

XDocument xmlDoc = null;

using (StreamReader oReader = new StreamReader(stream, Encoding.GetEncoding("Windows-1252")))
{
    xmlDoc = XDocument.Load(oReader);
}
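To check this behaviour without touching Azure, here is a minimal self-contained sketch. It assumes the blob bytes really are Windows-1252 and that the XML carries no declaration naming that encoding, which is why XDocument.Load(Stream) would treat them as UTF-8. EncodingDemo, LoadWithEncoding and RoundTrip are hypothetical names used only for illustration.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class EncodingDemo
{
    // Hypothetical helper: decode the stream with an explicit encoding
    // instead of letting the XML reader guess it from the bytes.
    public static XDocument LoadWithEncoding(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return XDocument.Load(reader);
        }
    }

    // Builds a Windows-1252 byte stream in memory (standing in for the blob)
    // and returns the text of the <Str> element after loading it.
    public static string RoundTrip()
    {
        // Needed on .NET Core / .NET 5+, where code page 1252 is not
        // registered by default. On .NET Framework the code page is built in
        // and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        var cp1252 = Encoding.GetEncoding(1252);
        byte[] bytes = cp1252.GetBytes("<Root><Str>Družstevní</Str></Root>");

        using (var stream = new MemoryStream(bytes))
        {
            XDocument doc = LoadWithEncoding(stream, cp1252);
            return doc.Root.Element("Str").Value;
        }
    }
}
```

Calling EncodingDemo.RoundTrip() should give back the original street name, because the reader, not the XML parser, decides how the bytes are decoded.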

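If the rest of your code has to keep receiving a Stream (as asked in the question), one possible option is to re-encode the downloaded bytes as UTF-8 once and pass that stream on. This is only a sketch: StreamTranscoder and ToUtf8 are hypothetical names, and it assumes the XML declaration does not itself name a different encoding, since a parser reading the new stream would still follow such a declaration.

```csharp
using System.IO;
using System.Text;

public static class StreamTranscoder
{
    // Hypothetical helper: reads 'source' using 'sourceEncoding' and returns
    // a new, rewound MemoryStream holding the same text as UTF-8 (no BOM).
    public static MemoryStream ToUtf8(Stream source, Encoding sourceEncoding)
    {
        var output = new MemoryStream();

        // leaveOpen: true keeps both underlying streams usable after disposal
        // of the reader and writer.
        using (var reader = new StreamReader(source, sourceEncoding, false, 4096, leaveOpen: true))
        using (var writer = new StreamWriter(output, new UTF8Encoding(false), 4096, leaveOpen: true))
        {
            var buffer = new char[4096];
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, read);
            }
        }

        // Rewind so the caller can read the result from the start.
        output.Position = 0;
        return output;
    }
}
```

With that in place, something like XDocument.Load(StreamTranscoder.ToUtf8(fileStream, Encoding.GetEncoding(1252))) would keep the stream-based code path. For large blobs the extra in-memory copy may matter, so loading through a StreamReader as shown above is probably the simpler choice.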

Upvotes: 1
