user1948635

Reputation: 1409

Databricks Workflow Output in Data Factory

I have seen a lot of articles explaining how to return the output from a databricks notebook in Data Factory. Is it possible to do this from a workflow?

I am new to Databricks, but the project I am working on calls a series of Databricks notebooks from ADF, every call is done via a databricks workflow, so notebooks are never called directly.

Essentially, this is what is happening now -

  1. Pipeline calls workflow and waits
  2. Workflow triggers a notebook
  3. Notebook code executes (python), returns a value with this code - dbutils.notebook.exit('test')
  4. Pipeline tries to get the output and fails with this error - "cannot be evaluated because property 'runOutput' doesn't exist"

Upvotes: 0

Views: 355

Answers (1)

Anupam Chand

Reputation: 2687

This is possible, but since the workflow has multiple tasks, the solution is a bit more involved. The process is as follows:

  1. Pipeline calls workflow and waits
  2. Workflow triggers a notebook
  3. Notebook code executes (python), returns a value with this code - dbutils.notebook.exit('test')
  4. Workflow may have another task which triggers the same or another notebook
  5. Notebook 2 code executes (python), returns a value with this code - dbutils.notebook.exit('test2')
  6. Pipeline gets the result using the jobs/runs/get API call. The response contains an array called “tasks”, which holds one unique run_id for each task.
  7. Pipeline loops through each task run_id to get the individual outputs, calling jobs/runs/get-output with run_id as a parameter. The run_id must be that of the task, not of the job itself.
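Steps 6 and 7 can be sketched in Python. This is a minimal sketch, not a definitive implementation: the host, token, and helper names are placeholders, and it assumes the Jobs API 2.1 endpoints, where jobs/runs/get returns a "tasks" array and jobs/runs/get-output returns the dbutils.notebook.exit value under notebook_output.result.

```python
import json
from urllib.request import Request, urlopen


def task_run_ids(run_json):
    """Extract the per-task run_ids from a jobs/runs/get response.

    Each entry in the "tasks" array has its own run_id, distinct
    from the parent job's run_id.
    """
    return {t["task_key"]: t["run_id"] for t in run_json.get("tasks", [])}


def get_json(host, token, endpoint, params):
    # Hypothetical helper: issue an authenticated GET against the Jobs API.
    query = "&".join(f"{k}={v}" for k, v in params.items())
    req = Request(
        f"{host}/api/2.1/{endpoint}?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(req) as resp:
        return json.load(resp)


def collect_task_outputs(host, token, job_run_id):
    # Step 6: fetch the job run; its "tasks" array lists every task run.
    run = get_json(host, token, "jobs/runs/get", {"run_id": job_run_id})
    outputs = {}
    # Step 7: call jobs/runs/get-output once per task run_id.
    for task_key, run_id in task_run_ids(run).items():
        out = get_json(host, token, "jobs/runs/get-output", {"run_id": run_id})
        # notebook_output.result holds the dbutils.notebook.exit(...) value.
        outputs[task_key] = out.get("notebook_output", {}).get("result")
    return outputs
```

In ADF itself you would do the same thing with a Web activity for jobs/runs/get, a ForEach over the tasks array, and a Web activity inside the loop for jobs/runs/get-output.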

Upvotes: 1
