Reputation: 2785
In my project, uploading an item to storage is a two-step process.
Step 1 - I upload the blob to Azure Storage.
Step 2 - I then run a query on the blob index tags and retrieve a list of blobs (including the one I just uploaded in step 1).
I'm doing this because I'm creating a folder directory system, whereby the user can create their own folders and subfolders, and the blobs are referenced to this folder structure. In essence, it is a way of creating a look and feel similar to Windows File Explorer for organizing the uploads.
I've had to add a delay of a couple of seconds between steps 1 and 2; if I query the storage straight after the upload has completed, the newly added blob is not returned.
I'm not sure how best to go about this, because adding a static delay between the two steps seems a dangerous approach in case the delay is not long enough in some scenarios.
Step 1 method:
public async Task UploadContentAsync(UploadContentRequest request)
{
    try
    {
        // Get the BlobContainerClient & Container
        var container = _blobServiceClient.GetBlobContainerClient(request.ContainerName);
        // Create a container if not exist.
        await container.CreateIfNotExistsAsync();
        var blobClient = container.GetBlobClient(request.FileName);
        var bytes = Encoding.UTF8.GetBytes(request.Content);
        await using var memoryStream = new MemoryStream(bytes);
        await blobClient.UploadAsync(memoryStream, new BlobHttpHeaders { ContentType = request.FileName.GetContentType() });
        if (request.BlobIndexTags != null)
        {
            // Set or update blob index tags on existing blob
            Dictionary<string, string> tags = new()
            {
                { "parentFolder", request.BlobIndexTags.ParentFolder }
            };
            await blobClient.SetTagsAsync(tags);
        }
        if (request.BlobMetadata != null)
        {
            IDictionary<string, string> metadata = new Dictionary<string, string>
            {
                // Add metadata to the dictionary by calling the Add method
                { "description", request.BlobMetadata.Description }
            };
            // Set the blob's metadata.
            await blobClient.SetMetadataAsync(metadata);
        }
    }
    catch (RequestFailedException ex)
    {
        LogExtension logExtension = new(_config);
        string logData = "AzureBlobStorage-API encountered an exception when uploading content with name [" + request.FileName + "]";
        logExtension.WriteLogEvent<BlobService>("Error", "Blob Storage", "Upload Content", request.FileName, logData, ex);
        throw;
    }
}
In the above method, I first upload the content (in my case, a short string), then add some index tags, and finally set some metadata.
Step 2 method:
public async Task<List<TaggedBlobItem>> GetBlobsWithQueryAsync(string query)
{
    try
    {
        var blobs = new List<TaggedBlobItem>();
        await foreach (TaggedBlobItem taggedBlobItem in _blobServiceClient.FindBlobsByTagsAsync(query))
        {
            blobs.Add(taggedBlobItem);
        }
        return blobs;
    }
    catch (RequestFailedException ex)
    {
        LogExtension logExtension = new(_config);
        string logData = "AzureBlobStorage-API encountered an exception when fetching blobs with query [" + query + "]";
        logExtension.WriteLogEvent<BlobService>("Error", "Blob Storage", "Get Blobs with Query", query, logData, ex);
        throw;
    }
}
The above method is run straight after the first method completes. Both steps run perfectly fine if I add a short delay between them, currently set to 2 seconds.
My question is not about the code specifically, but about whether delays are generally expected when using the SDK to upload a blob. I'm a little puzzled, considering that I await the upload in step 1 before I try to fetch the same blob (and others) by querying index tags in step 2.
These two methods live in my API controller microservice, but the upload request is driven from my front-end app using Ajax in JavaScript. Note that I do not run method 2 until the Ajax call for step 1 has returned, so I don't believe the issue stems from any further back in my project.
It seems what I'm looking for is some kind of "blob upload completed" event to await, something like "Blob has uploaded and is ready to be queried".
My guess is that completing an upload through the SDK doesn't actually mean the blob is ready to be found and retrieved through a query straight away, as Azure is likely still processing the upload in the background...
Upvotes: 2
Views: 1833
Reputation: 136146
The behavior you are seeing is not related to the SDK. Rather, it is a limitation (in a manner of speaking) of Azure Storage. Whenever tags are set on a blob, they are persisted immediately; however, there is some delay before you can search for blobs using those tags.
From the Azure Storage documentation on blob index tags:
The indexing engine exposes your key-value attributes into a multi-dimensional index. After you set your index tags, they exist on the blob and can be retrieved immediately. It may take some time before the blob index updates. After the blob index updates, you can use the native query and discovery capabilities offered by Blob Storage.
I have not been able to find any documentation on how long it takes for the blob index to update so that you can search for blobs by tags.
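Since the indexing delay is undocumented, one alternative to a fixed delay is to poll the query with a bounded, growing wait until the new blob shows up. The following is only a minimal sketch of that idea, not something the Azure SDK provides: the `Polling` class and `WaitUntilAsync` method are hypothetical names introduced for illustration, and the helper is deliberately generic so it does not depend on any storage API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the Azure SDK).
public static class Polling
{
    // Evaluates `condition` up to `maxAttempts` times.
    // Returns true as soon as the condition holds, false if every attempt fails.
    // Between attempts it waits `initialDelay`, doubling each time up to `maxDelay`.
    public static async Task<bool> WaitUntilAsync(
        Func<Task<bool>> condition,
        int maxAttempts,
        TimeSpan initialDelay,
        TimeSpan maxDelay,
        CancellationToken cancellationToken = default)
    {
        var delay = initialDelay;
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (await condition())
            {
                return true;
            }

            // No point in waiting after the final attempt.
            if (attempt == maxAttempts)
            {
                break;
            }

            await Task.Delay(delay, cancellationToken);

            // Exponential backoff, capped at maxDelay.
            var doubled = TimeSpan.FromTicks(delay.Ticks * 2);
            delay = doubled < maxDelay ? doubled : maxDelay;
        }

        return false;
    }
}
```

With the methods from the question, the condition could be something like `async () => (await GetBlobsWithQueryAsync(query)).Any(b => b.BlobName == request.FileName)` (this needs `System.Linq`). The attempt count and delays are values you would have to tune yourself, and the caller still needs to decide what to do when the method returns false.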
Side Note:
I noticed that you first upload the blob and then send separate requests to update the tags and metadata. You do not really have to do that. You can set the tags and metadata in the upload call itself. Please see BlobUploadOptions for more details.
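As a rough sketch of what that could look like, the upload below passes the content type, index tags, and metadata in a single `BlobUploadOptions` instance. The class name, method name, and parameters are hypothetical and only mirror the values used in the question; null checks and the error logging from the original method are left out for brevity.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// Hypothetical wrapper, for illustration only.
public static class BlobUploadSketch
{
    public static async Task UploadWithTagsAndMetadataAsync(
        BlobClient blobClient,
        string content,
        string contentType,
        string parentFolder,
        string description)
    {
        // Headers, index tags, and metadata all travel with the upload request.
        var options = new BlobUploadOptions
        {
            HttpHeaders = new BlobHttpHeaders { ContentType = contentType },
            Tags = new Dictionary<string, string> { { "parentFolder", parentFolder } },
            Metadata = new Dictionary<string, string> { { "description", description } }
        };

        await using var stream = new MemoryStream(Encoding.UTF8.GetBytes(content));
        await blobClient.UploadAsync(stream, options);
    }
}
```

This replaces the separate `SetTagsAsync` and `SetMetadataAsync` calls with one request. It should not be expected to remove the indexing delay described above, since the tags still have to reach the blob index before a tag query can find the blob.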
Upvotes: 2