Nikhil Ravindran

Reputation: 414

Databricks Bearer Token Creation API Call from PowerShell Script Erroring out with 403 Error

I am trying to create a CI/CD pipeline using the YAML method in Azure DevOps. The YAML file uses a PowerShell script that calls the Azure Databricks API to generate a bearer token. I am using a service principal for this, but the step that issues the POST request to the API returns the error below:

[Screenshot: Azure DevOps pipeline output showing the 403 error]

Below is the PowerShell script I am trying to use:

param
(
    [parameter(Mandatory = $true)] [String] $databricksWorkspaceResourceId,
    [parameter(Mandatory = $true)] [String] $databricksWorkspaceUrl,
    [parameter(Mandatory = $true)] [String] $databricksOrgId,
    [parameter(Mandatory = $false)] [int] $tokenLifeTimeSeconds = 300
)

$azureDatabricksPrincipalId = '2ff814a6-3304-4ab8-85cb-cd0e6f879c1d'

$headers = @{}
$headers["Authorization"] = "Bearer $((az account get-access-token --resource $azureDatabricksPrincipalId | ConvertFrom-Json).accessToken)"
$headers["X-Databricks-Azure-SP-Management-Token"] = "$((az account get-access-token --resource https://management.core.windows.net/ | ConvertFrom-Json).accessToken)"
$headers["X-Databricks-Org-Id"] = $databricksOrgId
$headers["X-Databricks-Azure-Workspace-Resource-Id"] = $databricksWorkspaceResourceId

Write-Verbose $databricksWorkspaceResourceId
Write-Verbose $databricksWorkspaceUrl

$json = @{}
$json["lifetime_seconds"] = $tokenLifeTimeSeconds

$req = Invoke-WebRequest -Uri "https://$databricksWorkspaceUrl/api/2.0/token/create" -Body ($json | ConvertTo-Json) -ContentType "application/json" -Headers $headers -Method Post
$bearerToken = ($req.Content | ConvertFrom-Json).token_value

return $bearerToken

All the required parameters are passed in from the master YAML file.

The service principal I am using has been granted Contributor access on the required resource group and on the Databricks workspace. It has also been given the API permission on AzureDatabricks.

[Screenshot: service principal API permissions in Azure AD]

Is it that the service principal has not been granted admin consent for the API permissions? Or is it because the service principal has not been assigned the "Owner" role at the Databricks level?

Could someone please help me figure out what the issue is?

Note: the AAD token and the management access token are both generated as expected by my PowerShell script, and I have verified them.
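For reference, one way to sanity-check a generated AAD token is to decode its JWT payload and confirm the `aud` claim matches the Azure Databricks resource ID (`2ff814a6-3304-4ab8-85cb-cd0e6f879c1d`). A minimal sketch, assuming `jq` and `base64` are available (the demo token below is built locally; in real use you would pass the token from `az account get-access-token`):

```shell
# Hypothetical diagnostic: decode a JWT's payload so the "aud" claim can be
# checked against the Azure Databricks resource ID.
decode_jwt_payload() {
  # The payload is the second dot-separated segment, base64url-encoded.
  local payload
  payload=$(printf '%s' "$1" | cut -d '.' -f2 | tr '_-' '/+')
  # Re-pad to a multiple of 4 so base64 -d accepts it.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="${payload}="; done
  printf '%s' "$payload" | base64 -d
}

# Demo with a locally built token (header.payload.signature):
sample_payload=$(printf '{"aud":"2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"}' | base64 | tr -d '=\n' | tr '/+' '_-')
decode_jwt_payload "header.${sample_payload}.sig" | jq -r '.aud'
# prints 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d
```

If `aud` shows a different resource, the token was requested for the wrong audience and Databricks will reject it.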

Upvotes: 0

Views: 581

Answers (1)

Bright Ran-MSFT

Reputation: 13944

You can check the following things to resolve the issue:

  1. When using the `az account get-access-token` command to generate the AAD access token, ensure the associated user (or identity) has been added as an Azure Databricks workspace admin on the workspace Admin Settings page. In your case, you need to assign the workspace admin role to the service principal.

  2. Use the service principal to create an ARM (Azure Resource Manager) service connection. Also see "Connect to Microsoft Azure".

  3. Then use an Azure CLI task with the ARM service connection in the pipeline to run the related PowerShell script that generates the Databricks token for the service principal.

    variables:
      resourceID: 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d
      workspaceUrl: {workspace Url}
    
    steps:
    - task: AzureCLI@2
      displayName: 'Create Databricks PAT'
      inputs:
        azureSubscription: {ARM service connection name}
        scriptType: pscore
        scriptLocation: inlineScript
        inlineScript: |
          Write-Host "Generate AAD Access Token for Azure Databricks service."
          $AAD_Token = (az account get-access-token --resource $(resourceID) --query "accessToken" --output tsv)
          Write-Host $AAD_Token
    
          Write-Host "Generate Azure Databricks PAT."
          $DatabricksPAT_response = (curl --request POST "$(workspaceUrl)/api/2.0/token/create" --header "Authorization: Bearer $AAD_Token" --data '{\"lifetime_seconds\": 600, \"comment\": \"This is an example token.\"}')
          # curl returns the raw JSON response as a string, so it must be
          # parsed before the token_value property can be read.
          $DatabricksPAT = ($DatabricksPAT_response | ConvertFrom-Json).token_value
          Write-Host $DatabricksPAT
    

Here is a similar case for reference.


In addition, you can also try the following Bash script to generate the token.

tenant_ID='xxxx'
client_id='xxxx'
client_secret='xxxx'
workspaceUrl='xxxx'

aad_token=$(curl -X POST -H "Content-Type: application/x-www-form-urlencoded" "https://login.microsoftonline.com/$tenant_ID/oauth2/v2.0/token" \
-d "grant_type=client_credentials&client_id=$client_id&client_secret=$client_secret&scope=2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default" | jq -r '.access_token')
echo "$aad_token"

Databricks_pat=$(curl --request POST "$workspaceUrl/api/2.0/token/create" -H "Authorization: Bearer $aad_token" \
-d '{"lifetime_seconds": 600, "comment": "This is an example token."}' | jq -r '.token_value')
echo "$Databricks_pat"
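Once the PAT is generated inside a pipeline step, it exists only as a local shell variable; to hand it to later steps you can emit an Azure DevOps logging command. A small sketch (the variable name `databricksPat` is just an example; the placeholder token stands in for the real API response):

```shell
# Expose the generated PAT to subsequent pipeline steps as a secret variable.
Databricks_pat='dapiexampletoken'   # placeholder; in the pipeline this comes from the token/create call
vso_line="##vso[task.setvariable variable=databricksPat;issecret=true]$Databricks_pat"
echo "$vso_line"
```

Later steps in the same job can then reference the token as `$(databricksPat)`.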

EDIT:

You can follow the steps below to assign the workspace admin role to the Service Principal:

  1. Open the workspace Admin Settings page.

    [Screenshot: workspace Admin Settings page]

  2. Add the Service Principal into the workspace.

    [Screenshot: adding the service principal to the workspace]

  3. Add the Service Principal to the admins group in the workspace.

    [Screenshot: adding the service principal to the admins group]
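The UI steps above can also be scripted against the Databricks SCIM preview API. This is a sketch, not verified against your workspace: `DATABRICKS_HOST`, `ADMIN_TOKEN` (a token of an existing workspace admin), and `SP_APP_ID` are assumed to be set, and the display name is an example.

```shell
# Build the SCIM request body that registers the service principal in the workspace.
SP_APP_ID="${SP_APP_ID:-00000000-0000-0000-0000-000000000000}"  # placeholder application (client) ID
scim_body=$(cat <<EOF
{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"],
  "applicationId": "$SP_APP_ID",
  "displayName": "cicd-service-principal"
}
EOF
)
echo "$scim_body" | jq -e '.applicationId' > /dev/null   # sanity-check the JSON

# The actual call (commented out here since it needs a live workspace):
# curl -X POST "$DATABRICKS_HOST/api/2.0/preview/scim/v2/ServicePrincipals" \
#   -H "Authorization: Bearer $ADMIN_TOKEN" \
#   -H "Content-Type: application/scim+json" \
#   -d "$scim_body"
```

Adding the service principal to the workspace `admins` group afterwards is a similar PATCH against the SCIM Groups endpoint.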


EDIT_2:

If your Azure Databricks workspace has network restrictions (e.g., a firewall or proxy server), then when you run the related API/CLI calls in Azure Pipelines to access the workspace, you generally need to add the IP addresses (or IP ranges) of the agent machines to the allowlist of your Azure Databricks workspace.

  1. If you are using Microsoft-hosted agents to run the pipelines, you can add the service tags (AzureCloud.<region>) of all the possible Microsoft-hosted agents within the same Azure geography as your organization to the allowlist, for example AzureCloud.centralus, AzureCloud.eastus, AzureCloud.westus, etc.

  2. If you are using self-hosted agents installed on your own local machines or VMs, and the IP addresses of those machines do not change often, you can add the machines' IP addresses directly to the allowlist.
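If the workspace restriction in question is the built-in IP access lists feature, the allowlist itself can be managed through the Databricks REST API. A hedged sketch (`DATABRICKS_HOST` and `ADMIN_TOKEN` are assumed to be set, and the CIDR below is an example to replace with your agents' actual addresses):

```shell
# Build the request body that allowlists an example agent IP range.
allow_body=$(cat <<EOF
{
  "label": "ado-agents",
  "list_type": "ALLOW",
  "ip_addresses": ["20.37.158.0/23"]
}
EOF
)
echo "$allow_body" | jq -e '.list_type == "ALLOW"' > /dev/null  # sanity-check the JSON

# The actual calls (commented out; the feature must be enabled first):
# curl -X PATCH "$DATABRICKS_HOST/api/2.0/workspace-conf" \
#   -H "Authorization: Bearer $ADMIN_TOKEN" \
#   -d '{"enableIpAccessLists": "true"}'
# curl -X POST "$DATABRICKS_HOST/api/2.0/ip-access-lists" \
#   -H "Authorization: Bearer $ADMIN_TOKEN" -d "$allow_body"
```

Note that enabling IP access lists without including your own current IP can lock you out of the workspace UI, so add your own address in the same list.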


Upvotes: 1
