Røye

Reputation: 1154

Strange results from OpenReadAsync() when reading data from Azure Blob storage

I'm modifying an existing C# (.NET Core) app that reads a type of binary file, so that it reads from Azure Blob Storage instead.

I'm using Windows.Azure.Storage (8.6.0).

The issue is that this app reads the binary data from a Stream in very small blocks (e.g. 5000-6000 bytes), reflecting how the data is structured.

Example pseudo code:

var blocks = new List<byte[]>(); 
var numberOfBytesToRead = 6240;
var numberOfBlocksToRead = 1700;

using (var stream = await blob.OpenReadAsync())
{
  stream.Seek(3000, SeekOrigin.Begin); // start reading at a particular position
  for (int i = 1; i <= numberOfBlocksToRead; i++)
  {
    byte[] traceValues = new byte[numberOfBytesToRead];
    stream.Read(traceValues, 0, numberOfBytesToRead);
    blocks.Add(traceValues);
  }
}

If I try to read a 10 MB file using OpenReadAsync(), I get invalid/junk values in the byte arrays after around 4,190,000 bytes.

Some of the files can be more than 100 MB, so setting the StreamMinimumReadSize may not be the best solution.

What is going on here, and how can I fix this?

Upvotes: 1

Views: 1978

Answers (1)

John Rusk - MSFT

Reputation: 643

Are the invalid/junk values zeros? If so (and maybe even if not), check the return value from stream.Read. That method is not guaranteed to read the number of bytes you ask for; it can read fewer. In that case you are supposed to call it again in a loop until it has read the total amount you want. A quick web search should show you lots of examples of the necessary looping.
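A minimal sketch of such a loop (the `ReadFully` helper name is illustrative, not part of any library):

```csharp
using System.IO;

static class StreamHelpers
{
    // Reads up to `count` bytes into `buffer`, looping because Stream.Read
    // may return fewer bytes per call than requested.
    // Returns the total number of bytes actually read (less than `count`
    // only if the end of the stream is reached first).
    public static int ReadFully(Stream stream, byte[] buffer, int count)
    {
        int totalRead = 0;
        while (totalRead < count)
        {
            int read = stream.Read(buffer, totalRead, count - totalRead);
            if (read == 0)
                break; // end of stream
            totalRead += read;
        }
        return totalRead;
    }
}
```

In the question's loop, replacing the bare `stream.Read(traceValues, 0, numberOfBytesToRead)` with a call like `StreamHelpers.ReadFully(stream, traceValues, numberOfBytesToRead)` ensures each block is filled completely before moving on.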

Upvotes: 1

Related Questions