Reputation: 9219
I am trying to create and configure the Azure Databricks SCIM Provisioning Connector, so I can provision users in my Databricks workspace from AAD.
Following these instructions, I can get it to work manually. That is, creating and setting up the application in Azure Portal works and my selected users synchronise in Databricks. (The process wasn't completely straightforward. A lot of fiddling, which I don't remember, with the provisioning setup was needed before it did anything.)
When I try to transpose this into Terraform, I'm not getting very far:
I can create the application with Terraform, using the same Service Principal that created the Databricks Workspace resource:
data "azuread_application_template" "scim" {
  display_name = "Azure Databricks SCIM Provisioning Connector"
}

resource "azuread_application" "scim" {
  display_name = "${var.name}-scim"
  template_id  = data.azuread_application_template.scim.template_id

  feature_tags {
    enterprise = true
    gallery    = true
  }
}
Similarly, I can create the Databricks access token for my Service Principal very easily:
resource "databricks_token" "scim" {
  comment = "SCIM Integration"
}
Now I'm stuck: I can't find an azuread resource that looks appropriate.
(Aside: I note that, in my Terraform-created application, if I proceed to manually set up the users and provisioning in Azure Portal, it doesn't seem to do anything. I may be being impatient: the "Provision on Demand" button does actually work, but the polled synchronisation is either not doing anything or being really slow.)
(Edit: An update on the aside: the polled provisioning, set up manually on a Terraform-managed SCIM app, has now run twice since I wrote this question. In that time, it has not synchronised the users I manually selected; instead, it has deleted the "Provision on Demand" user in Databricks that I created earlier...)
Upvotes: 5
Views: 3344
Reputation: 327
I know it's too late, but I wrote a blog post while I was investigating the same thing.
Blog: "Manage Azure Databricks SCIM with Terraform", with a concrete code example in a GitHub gist.
Upvotes: 0
Reputation: 65
I'm trying to solve this puzzle myself.
On 1: From my understanding, you can assign users and groups via role assignments through MS Graph. See the first Terraform example under "App role assignment for accessing Microsoft Graph", and apply the configuration described in "Automate SCIM provisioning using Microsoft Graph", such as granting these permissions:
Application.ReadWrite.All
Application.ReadWrite.OwnedBy
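As a rough sketch of what such an assignment could look like with the azuread provider (not something I have verified end to end): the example below adopts the service principal that the gallery template creates and assigns one AAD group to it. The variable scim_group_name, the resource labels, and the assumption that the gallery app publishes a role with the display name "User" are all illustrative. The attribute is application_id in 2.x provider versions and client_id in newer ones, so adjust to your version.

```hcl
# Hypothetical input: display name of the AAD group to assign to the SCIM app.
variable "scim_group_name" {
  type    = string
  default = "AAD Group A"
}

data "azuread_group" "scim" {
  display_name = var.scim_group_name
}

# The gallery template already creates a service principal for the
# application, so adopt it instead of creating a second one.
resource "azuread_service_principal" "scim" {
  application_id = azuread_application.scim.application_id # "client_id" in newer provider versions
  use_existing   = true
}

locals {
  # Assumption: the gallery app publishes an app role whose display name is "User".
  scim_user_role_ids = [
    for role in azuread_service_principal.scim.app_roles : role.id
    if role.display_name == "User"
  ]
}

# Assign the group to the enterprise application under that role.
resource "azuread_app_role_assignment" "scim_group" {
  app_role_id         = local.scim_user_role_ids[0]
  principal_object_id = data.azuread_group.scim.object_id
  resource_object_id  = azuread_service_principal.scim.object_id
}
```

If the role lookup comes back empty, the index expression will fail at plan time, which at least makes a wrong assumption about the role name visible early.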
On 2: It doesn't seem to be possible to feed the workspace SCIM endpoint and token programmatically into the created Azure application "Azure Databricks SCIM Provisioning Connector", as these seem to be gallery-app-specific config parameters. So I'm afraid manual intervention is needed for that option.
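That said, more recent releases of the azuread provider document azuread_synchronization_secret and azuread_synchronization_job resources, which may cover this step. The following is only a sketch under that assumption and is untested here: the two variables are hypothetical inputs, the credential keys "BaseAddress" and "SecretToken" and the job template ID "dataBricks" are taken from my reading of the provider documentation and should be checked against your provider version, and the token reference assumes the databricks_token.scim resource from the question.

```hcl
# Hypothetical inputs, not part of the original question.
variable "databricks_workspace_url" {
  type        = string
  description = "Workspace host name, without the https:// prefix"
}

variable "scim_service_principal_id" {
  type        = string
  description = "ID of the service principal behind the SCIM enterprise application"
}

# Store the SCIM endpoint and token as the provisioning credentials.
resource "azuread_synchronization_secret" "scim" {
  service_principal_id = var.scim_service_principal_id

  credential {
    key   = "BaseAddress"
    value = "https://${var.databricks_workspace_url}/api/2.0/preview/scim"
  }

  credential {
    key   = "SecretToken"
    value = databricks_token.scim.token_value
  }
}

# Create and enable the provisioning job once the credentials exist.
resource "azuread_synchronization_job" "scim" {
  service_principal_id = var.scim_service_principal_id
  template_id          = "dataBricks" # assumption: template ID for the Databricks connector
  enabled              = true

  depends_on = [azuread_synchronization_secret.scim]
}
```

Note that the token value ends up in the Terraform state either way, so the state backend needs to be treated as sensitive.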
According to Databricks, fully automating AAD SCIM provisioning is not possible. The Terraform SCIM approach, however, would be fully automatable. For example:
// define which groups have access to a particular workspace
variable "groups" {
  default = {
    "AAD Group A" = {
      workspace_access      = true
      databricks_sql_access = false
    },
    "AAD Group B" = {
      workspace_access      = false
      databricks_sql_access = true
    }
  }
}

// read group members of given groups from AzureAD every time Terraform is started
data "azuread_group" "this" {
  for_each     = toset(keys(var.groups))
  display_name = each.value
}

// create or remove groups within databricks - all governed by "groups" variable
resource "databricks_group" "this" {
  for_each              = data.azuread_group.this
  display_name          = each.key
  external_id           = each.value.id
  workspace_access      = var.groups[each.key].workspace_access
  databricks_sql_access = var.groups[each.key].databricks_sql_access
}

// read users from AzureAD every time Terraform is started
data "azuread_user" "this" {
  for_each  = toset(flatten([for g in data.azuread_group.this : g.members]))
  object_id = each.value
}

// all governed by AzureAD, create or remove users from databricks workspace
resource "databricks_user" "this" {
  for_each     = data.azuread_user.this
  external_id  = each.value.id
  user_name    = each.value.user_principal_name
  display_name = each.value.display_name
  active       = each.value.account_enabled
}

// put users to respective groups
resource "databricks_group_member" "this" {
  for_each = toset(flatten(
    [
      for group_name in keys(var.groups) :
      [
        for member_id in data.azuread_group.this[group_name].members :
        jsonencode({
          user : member_id,
          group : group_name
        })
      ]
    ]
  ))
  group_id  = databricks_group.this[jsondecode(each.value).group].id
  member_id = databricks_user.this[jsondecode(each.value).user].id
}
Upvotes: 0