INDERPAL SINGH

Reputation: 11

Can we copy Azure blobs from one storage account to other storage accounts in parallel from the same machine?

In Microsoft Azure, I have a source storage account in one region and 3 destination storage accounts in 3 different regions. I want to copy blob data from the source storage account to all 3 destination storage accounts. Currently I am using the azcopy (version 6) command in a bash script to do it. It first completes the copy for one region, then starts the next. It takes almost an hour every day due to the geographical distance between the regions. I wanted to know if azcopy has any option to copy blobs from the source to multiple destinations in parallel. Any other suggestions to reduce the time are also invited :)

Generalization of the azcopy command being used in my bash script:

/usr/bin/azcopy --source https://[srcaccount].blob.core.windows.net/[container]/[path/to/blob] --source-key $SOURCE_KEY --destination https://[destaccount].blob.core.windows.net/[container]/[path/to/blob] --dest-key $DEST_KEY --recursive --quiet --exclude-older

Upvotes: 1

Views: 649

Answers (3)

user15117824

Reputation:

If you are looking for a PowerShell script, here it is:

Select-AzSubscription -Subscription "your subscription name"  # Set the subscription

# Variables
$SourceStorageAccount = "SourceStorageAccount"        # Replace with your source storage account
$DestStorageAccount = "DestStorageAccount"            # Replace with your destination storage account
$SourceResourceGroupName = "SourceResourceGroupName"  # Replace with your source resource group name
$DestResourceGroupName = "DestResourceGroupName"      # Replace with your destination resource group name

# Get the storage keys for both the source and destination storage accounts
$SourceStorageKey = Get-AzStorageAccountKey -Name $SourceStorageAccount -ResourceGroupName $SourceResourceGroupName
$DestStorageKey = Get-AzStorageAccountKey -Name $DestStorageAccount -ResourceGroupName $DestResourceGroupName

$SourceStorageContext = New-AzStorageContext -StorageAccountName $SourceStorageAccount -StorageAccountKey $SourceStorageKey.Value[0]
$DestStorageContext = New-AzStorageContext -StorageAccountName $DestStorageAccount -StorageAccountKey $DestStorageKey.Value[0]

# Loop over each container in the source account
$Containers = Get-AzStorageContainer -Context $SourceStorageContext
foreach ($Container in $Containers)
{
    $SourceContainerName = $Container.Name
    if ($SourceContainerName -eq "snow-cmdb")
    {
        $DestContainerName = $SourceContainerName + "copy3"

        # Create the destination container if it does not exist yet
        if (!((Get-AzStorageContainer -Context $DestStorageContext) | Where-Object { $_.Name -eq $DestContainerName }))
        {
            Write-Output "Creating new container $DestContainerName"
            New-AzStorageContainer -Name $DestContainerName -Permission Off -Context $DestStorageContext -ErrorAction Stop
        }

        # Start a server-side copy of every blob in the container
        $Blobs = Get-AzStorageBlob -Context $SourceStorageContext -Container $SourceContainerName
        $BlobCpyAry = @()  # Collect the copy handles so we can poll their status later

        foreach ($Blob in $Blobs)
        {
            $BlobName = $Blob.Name
            Write-Output "Copying $BlobName from $SourceContainerName"
            $BlobCopy = Start-AzStorageBlobCopy -Context $SourceStorageContext -SrcContainer $SourceContainerName -SrcBlob $BlobName -DestContext $DestStorageContext -DestContainer $DestContainerName -DestBlob $BlobName
            $BlobCpyAry += $BlobCopy
        }

        # Check the status of each copy, with percent copied
        foreach ($BlobCopy in $BlobCpyAry)
        {
            $CopyState = $BlobCopy | Get-AzStorageBlobCopyState
            $Message = $CopyState.Source.AbsolutePath + " " + $CopyState.Status + " " + ("{0:N2}%" -f (($CopyState.BytesCopied / $CopyState.TotalBytes) * 100))
            Write-Output $Message
        }
    }
}

Upvotes: 0

Glue Ops

Reputation: 698

Just spawn a separate instance of your script for each destination. That way the copies run in parallel.

Here is a simple guide for doing this in bash: https://www.slashroot.in/how-run-multiple-commands-parallel-linux
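A minimal sketch of that pattern, reusing the azcopy v6 flags from the question (the account names, container path, and the `DEST*_KEY` variables are placeholders you would substitute with your own):

```shell
#!/usr/bin/env bash
# Launch one azcopy per destination region as a background job, then wait
# for all of them. Account names, path and key variables are hypothetical.
copy_to() {
    local dest_account=$1
    local dest_key=$2
    /usr/bin/azcopy \
        --source "https://srcaccount.blob.core.windows.net/container/path" \
        --source-key "$SOURCE_KEY" \
        --destination "https://${dest_account}.blob.core.windows.net/container/path" \
        --dest-key "$dest_key" \
        --recursive --quiet --exclude-older
}

copy_to destaccount1 "$DEST1_KEY" &   # & puts each copy in the background
copy_to destaccount2 "$DEST2_KEY" &
copy_to destaccount3 "$DEST3_KEY" &
wait                                  # block until all three jobs finish
```

The total runtime then approaches that of the slowest single region instead of the sum of all three.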

Upvotes: 0

silent

Reputation: 16108

AzCopy can only copy data from one source to one destination at a time. But since you mention that you need to do this every day, I would probably go for a scheduled pipeline in Azure Data Factory instead. There you can also set up the three copy jobs as parallel activities.

Upvotes: 0
