CMJ
CMJ

Reputation: 131

Jinja with Databricks DAB SQL Project

I have a dab SQL project that I created based on the databricks templates for the default SQL project.

I would like to be able to create tables and views with the schema automatically based on the file location:

enter image description here

So I want all tables under 01_port to have schema name port, and 02_terminal to be terminal, and 03_dock to be dock. I then want subfolders like commercial to be dock_commercial etc.

I have tried to create a macro that will do this so whenever i want to create or replace a table or view, I just have to call this identifier macro:

CREATE OR REPLACE TABLE {{ source_identifier() }} AS 
SELECT
    *
FROM IDENTIFIER(CONCAT({{catalog}}, '.', 'gold', '.', 'vessel'))

But when I run my databricks bundle deploy, it simply doesn't know how to read the macro. It seems that it is running my SQL without rendering the jinja first.

I tried to include it in the databrick.yml file as a resource:

bundle:
  name: dna-port-dbx

include:  
  - resources/*.yml
  - src/macro/*.sql

It still doesn't include it and I get an error when running my job:

[PARSE_SYNTAX_ERROR] Syntax error at or near '{'. SQLSTATE: 42601 (line 2, pos 24)

== SQL == -- This query is used to create the vessel table of the port database CREATE OR REPLACE TABLE {{ source_identifier() }} AS ------------------------^^^

I can't seem to find any documentation on how to make sure that this will work.

{% macro create_or_replace_table() %}
  {% set file_name = this.name.split('.')[0] %}
  {% set path_parts = this.path.split('/') %}
  {% set schema_name = 
    path_parts | length > 2 
    ? (
      '01_port' in path_parts[0] 
      ? 'port_' ~ path_parts[1]
      : '02_terminal' in path_parts[0]
      ? 'terminal_' ~ path_parts[1]
      : '03_dock' in path_parts[0]
      ? 'dock_' ~ path_parts[1]
      : 'default_schema'
    )
    : (
      '01_port' in path_parts[0]
      ? 'port'
      : '02_terminal' in path_parts[0]
      ? 'terminal'
      : '03_dock' in path_parts[0]
      ? 'dock'
      : 'default_schema'
    )
  %}

  IDENTIFIER(CONCAT(${variables.catalog}, '.', {{ schema_name }}.{{ file_name }}))


{% endmacro %}

Upvotes: 0

Views: 55

Answers (1)

lukwybxy
lukwybxy

Reputation: 11

Try passing with :parameter instead of {{}}, so:

CREATE OR REPLACE TABLE IDENTIFIER(:source_identifier) AS 
SELECT
    *
FROM IDENTIFIER(CONCAT(:catalog, '.', 'gold', '.', 'vessel'))

Passing parameters also depends it you're working on notebooks .ipynb files or .sql files.

Let me know if it helped, I can elaborate some more on above, because SQL requires other SQL warehouse to be defined but notebook all_purpose cluster can also run SQLs with a hack.

Upvotes: 1

Related Questions