saul
saul

Reputation: 979

How to run a remote command (powershell/bash) against an existing Azure VM in Azure Data Factory V2?

I've been trying to find a way to run a simple command against one of my existing Azure VMs using Azure Data Factory V2.

Options so far:

The use case is the following:

There are a few tasks that are running locally (Azure VM) in task scheduler that I'd like to orchestrate using ADF as everything else is in ADF, these tasks are usually python applications that restore a SQL Backup and or purge some folders.

i.e. sqdb-restore -r myDatabase

where sqldb-restore is a command that is recognized locally after installing my local python library. Unfortunately the python app needs to live locally in the VM.

Any suggestions? Thanks.

Upvotes: 2

Views: 2328

Answers (2)

saul
saul

Reputation: 979

Thanks to @martin-esteban-zurita, his answer helped me to get to what I needed and this was a beautiful and fun experiment.

It is important to understand that Azure Automation is used for many things regarding resource orchestration in Azure (VMs, Services, DevOps), this automation can be done with Powershell and/or Python.

In this particular case I did not need to modify/maintain/orchestrate any Azure resource, I needed to actually run a Bash/Powershell command remotely into one of my existing VMs where I have multiple Powershell/Bash commands running recurrently in "Task Scheduler". "Task Scheduler" was adding unnecessary overhead to my data pipelines because it was unable to talk to ADF.

In addition, Azure Automation natively only runs Powershell/Python commands in Azure Cloud Shell which is very useful to orchestrate resources like turning on/off Azure VMs, adding/removing permissions from other Azure services, running maintenance or purge processes, etc, but I was still unable to run commands locally in an existing VM. This is where the Hybrid Runbook Worker came into to picture. A Hybrid worker group

These are the steps to accomplish this use case.

1. Create an Azure Automation Account

2. Install the Windows Hybrid Worker in my existing VM . In my case it was tricky because my proxy was giving me some errors. I ended up downloading the Nuget Package and manually installing it.

.\New-OnPremiseHybridWorker.ps1 -AutomationAccountName <NameofAutomationAccount> -AAResourceGroupName <NameofResourceGroup>
-OMSResourceGroupName <NameofOResourceGroup> -HybridGroupName <NameofHRWGroup>
-SubscriptionId <AzureSubscriptionId> -WorkspaceName <NameOfLogAnalyticsWorkspace>

Keep in mind that in the above code, you will need to find your own parameter values, the only parameter that does not have to be found and will be created is HybridGroupName this will define the name of the Hybrid Group

3. Create a PowerShell Runbook

[CmdletBinding()]
Param
([object]$WebhookData) #this parameter name needs to be called WebHookData otherwise the webhook does not work as expected.
$VerbosePreference = 'continue'

#region Verify if Runbook is started from Webhook.

# If runbook was called from Webhook, WebhookData will not be null.
if ($WebHookData){

    # Collect properties of WebhookData
    $WebhookName     =     $WebHookData.WebhookName
    # $WebhookHeaders  =     $WebHookData.RequestHeader
    $WebhookBody     =     $WebHookData.RequestBody

    # Collect individual headers. Input converted from JSON.
    $Input = (ConvertFrom-Json -InputObject $WebhookBody)
    # Write-Verbose "WebhookBody: $Input"
    #Write-Output -InputObject ('Runbook started from webhook {0} by {1}.' -f $WebhookName, $From)
}
else
{
   Write-Error -Message 'Runbook was not started from Webhook' -ErrorAction stop
}
#endregion

# This is where I run the commands that were in task scheduler

$callBackUri = $Input.callBackUri

 # This is extremely important for ADF
 Invoke-WebRequest -Uri $callBackUri -Method POST

4. Create a Runbook Webhook pointing to the Hybrid Worker's VM

enter image description here

enter image description here

4. Create a webhook activity in ADF where the above PowerShell runbook script will be called via a POST Method

Important Note: When I created the webhook activity it was timing out after 10 minutes (default), so I noticed in the Azure Automation Account that I was actually getting INPUT data (WEBHOOKDATA) that contained a JSON structure with the following elements:

  • WebhookName
  • RequestBody (This one contains whatever you add in the Body plus a default element called callBackUri)

All I had to do was to invoke the callBackUri from Azure Automation. And this is why in the PowerShell runbook code I added Invoke-WebRequest -Uri $callBackUri -Method POST. With this, ADF was succeeding/failing instead of timing out.

There are many other details that I struggled with when installing the hybrid worker in my VM but those are more specific to your environment/company.

Upvotes: 2

Martin Esteban Zurita
Martin Esteban Zurita

Reputation: 3209

This looks like a use case that is supported with Azure Automation, using a hybrid worker. Try reading here: https://learn.microsoft.com/en-us/azure/automation/automation-hybrid-runbook-worker

You can call runbooks with webhooks in ADFv2, using the web activity.

Hope this helped!

Upvotes: 1

Related Questions