Reputation: 976
I am using Databricks Asset Bundles with a dev and a prod environment. I have a YAML file that looks something like this:
    # yaml-language-server: $schema=bundle_config_schema.json
    bundle:
      name: baby-names

    resources:
      tasks:
        - task_key: retrieve-baby-names-task
          existing_cluster_id: 1234
          notebook_task:
            notebook_path: ./retrieve-baby-names.py

    targets:
      development:
        workspace:
          host: <workspace-url>
      production:
        workspace:
          host: <workspace-url>
This works great if you have the same cluster ID in every environment, but I don't, and as far as I can see Jinja is not supported. How can I add logic that deploys to environment A with the cluster ID for that environment, and to environment B with its own cluster ID? This seems fundamental.
I have tried manually copying and pasting the new IDs, which isn't what I want to do.
Upvotes: 1
Views: 1185
Reputation: 31
You can retrieve the cluster ID with a lookup variable block, using the name of the cluster. Every time you target a specific environment by running databricks bundle with the -t flag, the lookup resolves to the ID of the cluster that matches the name you provided. It looks like this:
    variables:
      cluster_id:
        description: Cluster ID for the given name
        lookup:
          cluster: "<cluster_name>"
You can then reference this variable through the interpolation provided by Databricks Asset Bundles: ${var.cluster_id}
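To show how the lookup variable and the interpolation fit together, here is a rough sketch of a full bundle file. It assumes the task sits under a job in resources.jobs; the job key retrieve-baby-names-job is a hypothetical name that is not in the question, and <cluster_name> and <workspace-url> are left as placeholders.

```yaml
bundle:
  name: baby-names

variables:
  cluster_id:
    description: Cluster ID for the given name
    lookup:
      # Resolved by name against the workspace of the selected target
      cluster: "<cluster_name>"

resources:
  jobs:
    retrieve-baby-names-job:   # hypothetical job key, not from the question
      name: retrieve-baby-names-job
      tasks:
        - task_key: retrieve-baby-names-task
          # Replaces the hard-coded ID with the looked-up value
          existing_cluster_id: ${var.cluster_id}
          notebook_task:
            notebook_path: ./retrieve-baby-names.py

targets:
  development:
    workspace:
      host: <workspace-url>
  production:
    workspace:
      host: <workspace-url>
```

With a file like this, databricks bundle deploy -t development and databricks bundle deploy -t production should each resolve the ID in their own workspace, provided a cluster with that name exists in both. If the cluster names differ between environments, a variables mapping under each target can set a different value for cluster_id.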
Upvotes: 3
Reputation: 976
The best solution I found was to use the Jinja package in Python and add a task to my build tool that dynamically generates the YAML with the values for each environment.
Upvotes: 1