Byron Brummer

Reputation: 61

Azure blob download is incredibly slow using PowerShell (via Get-AzureStorageBlobContent), but fast via Azure Explorer, etc?

With very basic code that simply loops through my storage account and mirrors all containers and blobs to my local disk, I'm finding the Get-AzureStorageBlobContent cmdlet to be incredibly slow. It seems to take a second or two of real time per blob regardless of the blob size, which adds considerable overhead when we've got thousands of tiny files.

In contrast, on the same machine and network connection (even running simultaneously), Azure Explorer does the same bulk copy 10x to 20x faster, and AzCopy does it literally 100x faster (async), so clearly it's not a network issue.

Is there a more efficient way to use the Azure storage cmdlets, or are they just dog slow by nature? The help for Get-AzureStorageContainer mentions a -ConcurrentTaskCount option, which implies some ability to run asynchronously, but there's no documentation on how to achieve that, and given that the cmdlet only operates on a single item I'm not sure how it could.

This is the code I'm running:

$localContent       = "C:\local_copy"
$storageAccountName = "myblobaccount"
$storageAccountKey  = "mykey"

Import-Module Azure    

$blob_account = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey -Protocol https

Get-AzureStorageContainer -Context $blob_account | ForEach-Object {
    $container = $_.Name

    Get-AzureStorageBlob -Container $container -Context $blob_account | ForEach-Object {
        $local_path = "$localContent\{0}\{1}" -f $container, $_.Name

        $local_dir = Split-Path $local_path
        if (!(Test-Path $local_dir)) {
            New-Item -Path $local_dir -ItemType directory -Force
        }
        Get-AzureStorageBlobContent -Context $blob_account -Container $container -Blob $_.Name -Destination $local_path -Force | Out-Null
    }
}

Upvotes: 3

Views: 5018

Answers (2)

Gaurav Mantri

Reputation: 136146

I looked at the source code for Get-AzureStorageBlobContent on GitHub and found a few interesting things that may explain the slow downloads (especially for smaller blobs):

Line 165:

ICloudBlob blob = Channel.GetBlobReferenceFromServer(container, blobName, accessCondition, requestOptions, OperationContext);

This call makes a request to the server to fetch the blob type, so you add one extra request to the server for each blob.
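One way to test whether that lookup matters is to hand the cmdlet the blob objects that the listing already returned, rather than container and blob names. The sketch below assumes, from my reading of the cmdlet source, that pipeline input skips the reference lookup; I have not measured it. Copy-AllBlobs is a hypothetical helper name, and it assumes -Destination accepts an existing directory. Blobs whose names contain "/" may still need the per-blob directory handling from the question.

```powershell
Import-Module Azure

# Copy-AllBlobs is a hypothetical helper name, not part of the Azure module.
function Copy-AllBlobs {
    param(
        [Parameter(Mandatory = $true)] $Context,
        [Parameter(Mandatory = $true)] [string] $LocalRoot
    )

    Get-AzureStorageContainer -Context $Context | ForEach-Object {
        # One local directory per container, created up front.
        $containerDir = Join-Path $LocalRoot $_.Name
        New-Item -Path $containerDir -ItemType Directory -Force | Out-Null

        # Pipe the listed blob objects straight into the download cmdlet
        # instead of passing -Container/-Blob names for each blob.
        Get-AzureStorageBlob -Container $_.Name -Context $Context |
            Get-AzureStorageBlobContent -Destination $containerDir -Force |
            Out-Null
    }
}
```

With the variables from the question, this would be called as Copy-AllBlobs -Context $blob_account -LocalRoot $localContent. Wrapping both versions in Measure-Command should show whether the extra request is a meaningful part of the per-blob cost.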

Line 252 - 262:

        try
        {
            DownloadBlob(blob, filePath);

            Channel.FetchBlobAttributes(blob, accessCondition, requestOptions, OperationContext);
        }
        catch (Exception e)
        {
            WriteDebugLog(String.Format(Resources.DownloadBlobFailed, blob.Name, blob.Container.Name, filePath, e.Message));
            throw;
        }

If you look at the code above, it first downloads the blob (DownloadBlob) and then tries to fetch the blob attributes (Channel.FetchBlobAttributes). I haven't looked at the source code for the Channel.FetchBlobAttributes function, but I suspect it makes one more request to the server.

So to download a single blob, the code is essentially making 3 requests to the server, which could be the reason for the slowness. To be certain, you could trace the requests and responses through Fiddler and see exactly how the cmdlet is interacting with storage.
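As a rough sanity check, three sequential round trips per blob would be in the right range for the "second or two per blob" reported in the question. Every number in this sketch is an assumption chosen for illustration, not a measurement:

```powershell
# All figures are assumed values for illustration only.
$blobCount       = 5000   # "thousands of tiny files"
$requestsPerBlob = 3      # reference lookup + download + attribute fetch
$roundTripMs     = 400    # assumed latency of one request, in milliseconds

# Requests run one after another, so their latencies add up per blob.
$perBlobSeconds = $requestsPerBlob * $roundTripMs / 1000
$totalMinutes   = $blobCount * $perBlobSeconds / 60

"Per blob: {0} s, total: {1} min" -f $perBlobSeconds, $totalMinutes
```

If the Fiddler trace shows a different request count or latency, substituting those values gives a quick estimate of how much time a tool that issues requests concurrently could save.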

Upvotes: 2

crthompson

Reputation: 15865

Check out Blob Transfer Utility. It uses the Azure API, and it's a good bet that Azure Explorer uses it as well. BTU is open source, so it would be much easier to test whether it's the cmdlet that is the problem.

Upvotes: 0
