Reputation: 1755
I am trying to get the file count and total file size of each container, but it is very slow. This is the code I am currently using:
var blobServiceClient = new BlobServiceClient(connectionStr);
var blobs = blobServiceClient.GetBlobContainers();
foreach (var blob in blobs)
{
    var containerClient = blobServiceClient.GetBlobContainerClient(blob.Name);
    var blobItems = containerClient.GetBlobs();
    var fileCount = blobItems.Count();
    long fileSize = 0;
    foreach (var blobItem in blobItems)
    {
        var blobClient = containerClient.GetBlobClient(blobItem.Name);
        var properties = blobClient.GetProperties();
        fileSize += properties.Value.ContentLength;
    }
    var storageInfo = new StorageInformation()
    {
        Customer_GUID = new Guid(blob.Name),
        FileCount = fileCount,
        StorageSize = fileSize
    };
    dbContext.StorageInformation.Add(storageInfo);
    await dbContext.SaveChangesAsync();
}
Is there a way to do this faster? I have about 500 containers averaging 40k blobs in each one.
Upvotes: 0
Views: 1669
Reputation: 1506
Thanks @Peter Bons for the comment.
Performance can be improved with small changes to the code and by releasing resources promptly once they are no longer needed. Making good use of Azure blob indexing (Indexing Azure Blobs) can also help.
Parallel programming is another way to speed this up.
A further option is to save only the blobs' metadata in a database and fetch it from the database as required.
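One code-level improvement worth trying is to read each blob's size from the listing itself rather than calling GetProperties once per blob. The code in the question also enumerates blobItems twice (once for Count() and once in the foreach), which lists the container twice. The following is a minimal sketch, assuming the Azure.Storage.Blobs v12 package; the names ContainerStats and MeasureAsync are hypothetical and not part of the SDK.

```csharp
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

public static class ContainerStats
{
    // Hypothetical helper: counts blobs and sums their sizes in a single
    // listing pass. Each listed BlobItem already carries its size, so no
    // per-blob GetProperties round trip is made.
    public static async Task<(int FileCount, long TotalBytes)> MeasureAsync(
        BlobContainerClient containerClient)
    {
        int fileCount = 0;
        long totalBytes = 0;

        // GetBlobsAsync pages through the container lazily.
        await foreach (BlobItem blobItem in containerClient.GetBlobsAsync())
        {
            fileCount++;
            // ContentLength is nullable; treat a missing value as zero.
            totalBytes += blobItem.Properties.ContentLength ?? 0;
        }

        return (fileCount, totalBytes);
    }
}
```

Inside the question's container loop this would be called as `var (fileCount, fileSize) = await ContainerStats.MeasureAsync(containerClient);`. With roughly 40k blobs per container, this should replace tens of thousands of property requests with a much smaller number of paged listing requests, though the actual gain would need to be measured.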
Using Parallel.ForEach loops
Sample Code:
Parallel.ForEach(integerList, i =>
{
    long total = DoSomeIndependentTimeconsumingTask();
    Console.WriteLine("{0} - {1}", i, total);
});
// Sequential version
foreach (var item in sourceCollection)
{
Process(item);
}
// Parallel equivalent
Parallel.ForEach(sourceCollection, item => Process(item));
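Applied to the question, the containers could be processed in parallel. Because listing blobs is asynchronous I/O, and Parallel.ForEach does not await asynchronous delegates, the sketch below uses Parallel.ForEachAsync instead, which assumes .NET 6 or later. The names ParallelContainerScan, ScanAsync, measureAsync and maxParallel are hypothetical, and the default limit of 8 is an arbitrary starting point, not a recommendation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelContainerScan
{
    // Hypothetical helper: runs measureAsync for every container name with
    // at most maxParallel containers in flight, collecting results by name.
    public static async Task<IReadOnlyDictionary<string, (int FileCount, long TotalBytes)>> ScanAsync(
        IEnumerable<string> containerNames,
        Func<string, CancellationToken, Task<(int FileCount, long TotalBytes)>> measureAsync,
        int maxParallel = 8,
        CancellationToken cancellationToken = default)
    {
        // ConcurrentDictionary is safe to write from several tasks at once.
        var results = new ConcurrentDictionary<string, (int FileCount, long TotalBytes)>();
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxParallel,
            CancellationToken = cancellationToken
        };

        await Parallel.ForEachAsync(containerNames, options, async (name, ct) =>
        {
            results[name] = await measureAsync(name, ct);
        });

        return results;
    }
}
```

Here measureAsync would wrap whatever per-container logic is used, for example listing the container obtained from blobServiceClient.GetBlobContainerClient(name). An Entity Framework DbContext is not thread-safe, so it is probably safer to collect the results first and add them to dbContext after the scan finishes, with a single SaveChangesAsync call, than to write to it from inside the parallel loop.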
Blob metadata can be indexed, which is helpful if you expect any custom metadata properties to be useful in filters and queries.
.NET supports parallel programming by providing a runtime, class library types, and diagnostic tools. These features were introduced in .NET Framework 4 and simplify parallel development: you can write your own parallel code without having to work directly with threads or the thread pool.
Upvotes: 1