Izzy Rodriguez

Reputation: 2185

Azure BLOB possible bug - Random wrong file

So, I know it is kind of crazy to report a bug at this point in the Azure life cycle, but I'm out of options. Here we go.

We have a service that lets you upload files and a client that downloads them. The BLOB storage holds about 27 GB of data.

On a few occasions our users reported that they received the wrong files, so we checked our MVC route to see if anything was wrong and found nothing.

So we created a simple console app that downloads the same file in a loop:

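using System;
using System.IO;
using System.Net;
using System.Security.Cryptography;

public static class Program
{
    public static void Main()
    {
        // Hash of the first download; every later download is compared against it.
        var firstHash = string.Empty;
        using (var client = new WebClient())
        {
            for (int i = 0; i < 5000; i++)
            {
                try
                {
                    var date = DateTime.Now.ToString("HH-mm-ss-ffff");

                    var destination = @"C:\Users\Israel\Downloads\RO65\BLOB - RO65 -" + date + ".rfa";
                    client.DownloadFile("http://myboxfree.blob.core.windows.net/public/91fe9d90-71ce-4036-b711-a5300159abfa.rfa", destination);

                    // Compute the MD5 hash of the file that was just written to disk.
                    string hash;
                    using (var md5 = MD5.Create())
                    using (var stream = File.OpenRead(destination))
                    {
                        hash = Convert.ToBase64String(md5.ComputeHash(stream));
                    }

                    if (string.IsNullOrEmpty(firstHash))
                        firstHash = hash;

                    // Flag any download whose hash differs from the first one.
                    if (hash != firstHash) hash += " ---------------------------------------------";
                    Console.WriteLine("i: " + i + " = " + hash);
                }
                catch (Exception ex)
                {
                    // Report failures instead of silently swallowing them.
                    Console.WriteLine("i: " + i + " failed: " + ex.Message);
                }
            }
        }
    }
}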

So here is the result. Every now and then it downloads the wrong file:

[Screenshot: console output of the download loop]

The first 1000 downloads returned the right file. Then, out of the blue, the BLOB returned a different file, and after that it went back to normal.

The only things the two files have in common are the extension and the file size in bytes. The hash is (of course) different.

Any thoughts?

Upvotes: 3

Views: 347

Answers (1)

Jason Hogg - MSFT

Reputation: 1378

I tried rerunning your sample code and wasn't able to reproduce the problem.

Questions:

  • Have you compared the contents of the two different versions of the file that you are seeing downloaded? I think you said that two completely different blobs are being retrieved, but I wanted to verify that. How large is the delta between the two files?
  • Are you using RA-GRS together with the client library's read-from-secondary retry policy? If so, a network glitch could result in the read being served from the secondary region.
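
To put a number on that delta, something along these lines could be run against one good and one bad download. This is only a minimal sketch: the class name `BlobDiff` and both method names are made up for illustration, and it reads each file fully into memory, which assumes the files are small enough for that.

```csharp
using System;

public static class BlobDiff
{
    // Counts the byte positions at which the two buffers differ.
    // Bytes beyond the end of the shorter buffer count as differences.
    public static long CountDifferingBytes(byte[] a, byte[] b)
    {
        int common = Math.Min(a.Length, b.Length);
        long diff = Math.Abs((long)a.Length - b.Length);
        for (int i = 0; i < common; i++)
        {
            if (a[i] != b[i]) diff++;
        }
        return diff;
    }

    // Returns the offset of the first differing byte, or -1 if the buffers are identical.
    public static long FirstDifference(byte[] a, byte[] b)
    {
        int common = Math.Min(a.Length, b.Length);
        for (int i = 0; i < common; i++)
        {
            if (a[i] != b[i]) return i;
        }
        return a.Length == b.Length ? -1 : common;
    }
}
```

For example, `BlobDiff.CountDifferingBytes(File.ReadAllBytes(goodPath), File.ReadAllBytes(badPath))` gives the size of the delta. A handful of differing bytes would point towards corruption in transit, while a delta close to the full file size would suggest that a genuinely different blob was returned.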

Suggestions:

  • Can you track the ETag of the retrieved files? This allows you to check whether the blob has changed since you first started reading it.
  • The Storage Service does enable you to explicitly validate the integrity of your objects, to check whether they have been modified in transit, potentially due to network issues etc. See Azure Storage Md5 Overview for more information. The simplest way, however, might just be to use https, as these validations are already built into https.
  • Can you also try to repro using https and let me know if that helps?
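
As a rough illustration of the first two suggestions, the download loop could record the ETag and compare the Content-MD5 header with a hash of the bytes actually received. This is a sketch only: `DownloadRecord` and `BlobCheck` are hypothetical names, it uses `HttpClient` rather than the `WebClient` from the question, and it assumes the service sends a Content-MD5 header for the blob, which may not be the case if no MD5 was stored when the blob was uploaded.

```csharp
using System;
using System.Net.Http;
using System.Security.Cryptography;
using System.Threading.Tasks;

public sealed class DownloadRecord
{
    public string ETag;        // ETag response header, or null if absent
    public string HeaderMd5;   // Content-MD5 response header (base64), or null if absent
    public string ComputedMd5; // base64 MD5 of the body that was actually received

    // True when there is no header to compare against, or when it matches the body.
    public bool IsConsistent => HeaderMd5 == null || HeaderMd5 == ComputedMd5;
}

public static class BlobCheck
{
    public static string Md5Base64(byte[] body)
    {
        using (var md5 = MD5.Create())
        {
            return Convert.ToBase64String(md5.ComputeHash(body));
        }
    }

    // Downloads the URL once and captures the values needed to compare runs.
    public static async Task<DownloadRecord> DownloadAsync(HttpClient http, string url)
    {
        using (var response = await http.GetAsync(url))
        {
            response.EnsureSuccessStatusCode();
            byte[] body = await response.Content.ReadAsByteArrayAsync();
            byte[] headerMd5 = response.Content.Headers.ContentMD5;
            return new DownloadRecord
            {
                ETag = response.Headers.ETag?.Tag,
                HeaderMd5 = headerMd5 == null ? null : Convert.ToBase64String(headerMd5),
                ComputedMd5 = Md5Base64(body)
            };
        }
    }
}
```

Logging `ETag`, `ComputedMd5` and `IsConsistent` for every iteration should help tell the cases apart. If the ETag changes when the hash changes, a different blob or blob version was served. If the ETag stays the same but `IsConsistent` is false, the content was most likely altered in transit.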

Upvotes: 4
