Xophmeister

Reputation: 9219

Configuring the Azure AD Databricks SCIM application with Terraform

I am trying to create and configure the Azure Databricks SCIM Provisioning Connector, so I can provision users in my Databricks workspace from AAD.

Following these instructions, I can get it to work manually. That is, creating and setting up the application in Azure Portal works, and my selected users synchronise to Databricks. (The process wasn't completely straightforward: the provisioning setup needed a lot of fiddling, the details of which I don't remember, before it did anything.)

When I try to transpose this into Terraform, I'm not getting very far:

(Aside: if I take my Terraform-created application and manually set up the users and provisioning in Azure Portal, it doesn't seem to do anything. I may be being impatient: the "Provision on Demand" button does actually work, but the polled synchronisation is either doing nothing or is really slow.)

(Edit: An update on the aside: The polled provisioning -- set up manually on a Terraform-managed SCIM app -- has now run twice since I wrote this question. In that time, it has not synchronised the users I manually selected; instead, it deleted the "Provision on Demand" user that I had created in Databricks earlier...)

Upvotes: 5

Views: 3344

Answers (2)

MeneerBij

Reputation: 327

I know this is late, but I wrote a blog post while I was investigating the same thing.

Blog: Manage Azure Databricks SCIM with Terraform. A concrete code example is in the accompanying GitHub gist.

Upvotes: 0

Crypto

Reputation: 65

I'm trying to solve this puzzle myself.

On 1: As I understand it, you can assign users and groups to the application via app role assignments through MS Graph. See the first Terraform example in "App role assignment for accessing Microsoft Graph".

Then apply the configuration described in "Automate SCIM provisioning using Microsoft Graph", which includes granting these permissions:

Application.ReadWrite.All
Application.ReadWrite.OwnedBy
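
As a rough illustration of the role-assignment idea, the following is a minimal sketch using the azuread provider's azuread_app_role_assignment resource. It assumes the enterprise application already exists and that the provider runs with credentials holding the permissions above. The two variables and the group name are hypothetical placeholders, and the app role ID has to be looked up on the enterprise application itself.

```hcl
// Hypothetical input: client ID of the existing SCIM enterprise application
variable "scim_app_client_id" {
  type = string
}

// Hypothetical input: ID of the app role to grant; look it up on the
// enterprise application, it is not derived here
variable "scim_app_role_id" {
  type = string
}

// Service principal behind the enterprise application.
// Assumption: a recent azuread provider; older versions call this
// argument application_id instead of client_id.
data "azuread_service_principal" "scim" {
  client_id = var.scim_app_client_id
}

// AAD group whose members should be in scope for provisioning
data "azuread_group" "scope" {
  display_name = "AAD Group A"
}

// Assign the group to the application via an app role assignment
resource "azuread_app_role_assignment" "scope" {
  app_role_id         = var.scim_app_role_id
  principal_object_id = data.azuread_group.scope.object_id
  resource_object_id  = data.azuread_service_principal.scim.object_id
}
```

Whether an assignment made this way is picked up by the polled provisioning cycle is not something this sketch verifies.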

On 2: It doesn't seem to be possible to programmatically feed the workspace SCIM endpoint and token into the created Azure application "Azure Databricks SCIM Provisioning Connector", as these appear to be configuration parameters specific to the gallery app. So I'm afraid that option needs manual intervention.

According to Databricks, fully automating AAD SCIM provisioning is not possible. The Terraform SCIM approach, however, can be fully automated. For example:

// define which groups have access to a particular workspace
variable "groups" {
  default = {
    "AAD Group A" = {
      workspace_access      = true
      databricks_sql_access = false
    },
    "AAD Group B" = {
      workspace_access      = false
      databricks_sql_access = true
    }
  }
}

// read group members of given groups from AzureAD every time Terraform is started
data "azuread_group" "this" {
  for_each     = toset(keys(var.groups))
  display_name = each.value
}

// create or remove groups within databricks - all governed by "groups" variable
resource "databricks_group" "this" {
  for_each              = data.azuread_group.this
  display_name          = each.key
  external_id           = each.value.id
  workspace_access      = var.groups[each.key].workspace_access
  databricks_sql_access = var.groups[each.key].databricks_sql_access
}

// read users from AzureAD every time Terraform is started
data "azuread_user" "this" {
  for_each  = toset(flatten([for g in data.azuread_group.this : g.members]))
  object_id = each.value
}

// all governed by AzureAD, create or remove users from databricks workspace
resource "databricks_user" "this" {
  for_each     = data.azuread_user.this
  external_id  = each.value.id
  user_name    = each.value.user_principal_name
  display_name = each.value.display_name
  active       = each.value.account_enabled
}

// put users to respective groups
resource "databricks_group_member" "this" {
  for_each = toset(flatten(
    [
      for group_name in keys(var.groups) :
      [
        for member_id in data.azuread_group.this[group_name].members :
        jsonencode({
          user : member_id,
          group : group_name
        })
      ]
  ]))
  group_id  = databricks_group.this[jsondecode(each.value).group].id
  member_id = databricks_user.this[jsondecode(each.value).user].id
}
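
The example above needs both providers declared and configured. A minimal sketch of that wiring follows, assuming authentication comes from the environment (for instance an Azure CLI login); the databricks_host variable is a hypothetical name introduced here for illustration.

```hcl
terraform {
  required_providers {
    azuread = {
      source = "hashicorp/azuread"
    }
    databricks = {
      source = "databricks/databricks"
    }
  }
}

// Hypothetical input: URL of the target Databricks workspace
variable "databricks_host" {
  type = string
}

// Assumption: credentials are picked up from the environment
provider "azuread" {}

provider "databricks" {
  host = var.databricks_host
}
```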

Upvotes: 0
