Reputation: 105
Continuing the saga, here is part I: ContentHash is null in Azure.Storage.Blobs v12.x.x
After a lot of debugging, root cause appears to be that the content hash was not calculated after uploading a blob, therefore the BlobContentInfo
or BlobProperties
were returning a null content hash and my whole flow is based on receiving the hash from Azure.
What I've discovered is that it depends on which HttpRequest stream method I call and upload to azure:
HttpRequest.GetBufferlessInputStream()
, the content hash is not calculated, even if I go into azure storage explorer, the ContentMD5 of the blob is empty.
HttpRequest.InputStream()
everything works as expected.
Do you know why this different behavior? And do you know how to make to receive content hash for streams received by GetBufferlessInputStream
method.
So the code flow looks like this:
var stream = HttpContext.Current.Request.GetBufferlessInputStream(disableMaxRequestLength: true)
var container = _blobServiceClient.GetBlobContainerClient(containerName);
var blob = container.GetBlockBlobClient(blobPath);
BlobHttpHeaders blobHttpHeaders = null;
if (!string.IsNullOrWhiteSpace(fileContentType))
{
blobHttpHeaders = new BlobHttpHeaders()
{
ContentType = fileContentType,
};
}
// retry already configured of Azure Storage API
await blob.UploadAsync(stream, httpHeaders: blobHttpHeaders);
return await blob.GetPropertiesAsync();
In the code snippet from above ContentHash
is NOT calculated, but if I change the way I am getting the stream from the http request with following snippet ContentHash
is calculated.
var stream = HttpContext.Current.Request.InputStream
P.S. I think its obvious, but with the old sdk, content hash was calculated for streams received by GetBufferlessInputStream
method
P.S2 you can find also an open issue on github: https://github.com/Azure/azure-sdk-for-net/issues/14037
P.S3 added code snipet
Upvotes: 2
Views: 2748
Reputation: 16928
Ran into this today. From my digging, it appears this is a symptom of the type of Stream
you use to upload, and it's not really a bug. In order to generate a hash for your blob (which is done on the client side before uploading by the looks of it), it needs to read the stream. Which means it would need to reset the position of your stream back to 0 (for the actual upload process) after generating the hash. Doing this requires the ability to perform the Seek operation on the stream. If your stream doesn't support Seek, then it looks like it doesn't generate the hash.
To get around the issue, make sure the stream you provide supports Seek (CanSeek
). If it doesn't, then use a different Stream/copy your data to a stream that does (for example MemoryStream
). The alternative would be for the internals of the Blob SDK to do this for you.
Upvotes: 1
Reputation: 29950
A workaround is that when get the stream via GetBufferlessInputStream()
method, convert it to MemoryStream
, then upload the MemoryStream
. Then it can generate the contenthash
. Sample code like below:
var stream111 = System.Web.HttpContext.Current.Request.GetBufferlessInputStream(disableMaxRequestLength: true);
//convert to memoryStream.
MemoryStream stream = new MemoryStream();
stream111.CopyTo(stream);
stream.Position = 0;
//other code
// retry already configured of Azure Storage API
await blob.UploadAsync(stream, httpHeaders: blobHttpHeaders);
Not sure why, but as per my debug, I can see when using the method GetBufferlessInputStream()
in the latest SDK, during upload, it actually calls the Put Block api in the backend. And in this api, MD5 hash is not stored with the blob(Refer to here for details.). Screenshot as below:
However, when using InputStream
, it calls the Put Blob api. Screenshot as below:
Upvotes: 0