Grevioos

Reputation: 405

How to deploy Databricks cluster with specified permissions?

I am deploying some Databricks clusters using a PowerShell script that takes as input a JSON file with predefined cluster templates, for example:

{
    "cluster_name": "test1",
    "max_retries": 1,
    "spark_version": "5.3.x-scala2.11",
    "timeout_seconds": 3600,
    "autotermination_minutes": 60,
    "node_type_id": "Standard_DS3_v2",
    "driver_node_type_id": "Standard_DS3_v2",
    "spark_env_vars": {
      "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
    },
    "spark_conf": {
      "spark.databricks.delta.preview.enabled": "true"
    },
    "autoscale": {
      "max_workers": 4,
      "min_workers": 2
    }
  }  

However, I would like to pre-assign some Databricks permission groups to these clusters. Can I do that with such a cluster template? I cannot find any property that would let me specify those groups.

I can go to one of my clusters that has permissions assigned manually and export it as JSON. However, the permissions are missing from the exported template as well.

Thank you in advance!

Upvotes: 3

Views: 1133

Answers (2)

Midiparse

Reputation: 4781

The workaround that follows is extremely hacky, and I wouldn't advise anyone to resort to it if I knew another way. It works by creating a web session, logging in, fetching a CSRF token, and then issuing a POST request to /acl/cluster/<cluster_id> with a map from user IDs to the requested permissions. Here's an example in Python that sets all permissions on a single cluster for a single user (or group):

import json

import requests

DB_HOST = "db-cluster"
DB_USER = "user"
DB_PASS = "pass"

def change_acl(user_id, cluster_id):
    """Grant all permissions on one cluster to one user (or group) id."""
    host = DB_HOST
    username = DB_USER
    password = DB_PASS

    # Log in through the web UI form so the session carries the auth cookies.
    session = requests.Session()
    login_request = session.post("https://{}/j_security_check".format(host),
                                 data={"j_username": username, "j_password": password})
    if login_request.status_code >= 400:
        raise Exception("login failed : {}".format(login_request.content))

    # The CSRF token is read from the web app's /config response.
    config_request = session.get("https://{}/config".format(host))
    if config_request.status_code >= 400:
        raise Exception("config request failed : {}".format(config_request.content))

    config = json.loads(config_request.content)
    csrf_token = config["csrfToken"]

    # The body is sent as JSON text even though the declared content type
    # is form-urlencoded.
    acl_request = session.post(
        "https://{}/acl/cluster/{}".format(host, cluster_id),
        headers={
            "X-CSRF-Token": csrf_token,
            "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"
        },
        data=json.dumps({
            "type": "set",
            "permissions": {user_id: ["*"]}
        })
    )
    if acl_request.status_code >= 400:
        raise Exception("acl request failed : {}".format(acl_request.content))
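The request body is a map from user (or group) IDs to permission lists, so the same call could presumably cover several principals at once. A minimal sketch of building that body follows; `build_acl_body` is a hypothetical helper, and the `"set"` / `["*"]` shape is copied from the example above rather than from any documented API, so treat it as an assumption.

```python
import json


def build_acl_body(principal_ids, permissions=("*",)):
    """Build the JSON body for the POST to /acl/cluster/<cluster_id>.

    Hypothetical helper: the payload shape mirrors the example above and
    is not a documented Databricks API, so it may stop working at any time.
    """
    if not principal_ids:
        raise ValueError("at least one user or group id is required")
    return json.dumps({
        "type": "set",
        # One entry per principal, each receiving the same permission list.
        "permissions": {str(pid): list(permissions) for pid in principal_ids},
    })
```

The returned string would be passed as `data=` in the `session.post` call, in place of the inline `json.dumps(...)`.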

If you find a better way, please let me know. The worst thing about this approach is that you have to log in with a username and password instead of a bearer token. The second worst is that it may break without any notice.

I hope the developers will find the time to implement this functionality in the near future.

Upvotes: 2

CHEEKATLAPRADEEP

Reputation: 12768

Note: You cannot specify permissions while creating a cluster through the Clusters API. Use the Groups API or the Admin Console instead.

The request structure of the create-cluster call is shown below:

[Image: create-cluster request structure]

Privileges can be granted to users or groups that are created via the groups API and Admin Console. Each user is uniquely identified by their username (which typically maps to their email address) in Databricks. Users who are workspace administrators in Databricks belong to a special admin role and can also access objects that they haven’t been given explicit access to.
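As a rough sketch of the Groups API side, the snippet below builds (without sending) the requests that create a group and add a user to it. It assumes the 2.0 Groups API endpoints `groups/create` and `groups/add-member` and a personal access token; the function names, host, token, and group name are placeholders for illustration. Note that this only manages group membership; it does not itself grant any cluster permission.

```python
import json
import urllib.request


def groups_api_request(host, token, endpoint, payload):
    """Build (but do not send) a POST request for a Groups API endpoint."""
    return urllib.request.Request(
        "https://{}/api/2.0/groups/{}".format(host, endpoint),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer {}".format(token),
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_group_request(host, token, group_name):
    """Request that creates a new group with the given name."""
    return groups_api_request(host, token, "create", {"group_name": group_name})


def add_user_request(host, token, user_name, group_name):
    """Request that adds an existing user to an existing group."""
    return groups_api_request(
        host, token, "add-member",
        {"user_name": user_name, "parent_name": group_name},
    )
```

To actually send a request, pass it to `urllib.request.urlopen(req)` and check the response status.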

Hope this helps.


If this answers your query, do click “Mark as Answer” and "Up-Vote" for the same. And, if you have any further query do let us know.

Upvotes: 0
