How to run a Databricks notebook with MLflow in an Azure Data Factory pipeline?

Question

My colleagues and I are facing an issue when trying to run our Databricks notebook from Azure Data Factory (ADF). The error comes from MLflow.

The command that is failing is the following:

import json
from datetime import datetime

import mlflow

# Take the parent notebook path to use as path for the experiment
# (dbutils is provided by the Databricks notebook environment)
context = json.loads(dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson())
nb_base_path = context['extraContext']['notebook_path'][:-len("00_training_and_validation")]

# Create the experiment if needed, make it active, and look up its ID
experiment_path = nb_base_path + 'trainings'
mlflow.set_experiment(experiment_path)
experiment = mlflow.get_experiment_by_name(experiment_path)
experiment_id = experiment.experiment_id

# Start a timestamped run under that experiment
run = mlflow.start_run(experiment_id=experiment_id, run_name=f"run_{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}")
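
As a side note on the path handling in the snippet above: slicing off the notebook name by its length only works while the notebook keeps that exact name. A minimal sketch of a more robust way to derive the same experiment path is shown below. The helper name experiment_path_for and the example path /Repos/team/project are hypothetical and only for illustration; this does not address the MLflow error itself.

```python
import posixpath


def experiment_path_for(notebook_path: str, experiment_name: str = "trainings") -> str:
    """Return a path next to the notebook, in the notebook's parent folder.

    Hypothetical helper: it replaces the last path component (the notebook
    name) with experiment_name, whatever the notebook is called.
    """
    parent = posixpath.dirname(notebook_path)
    return posixpath.join(parent, experiment_name)


# Hypothetical workspace path, used only to illustrate the result
print(experiment_path_for("/Repos/team/project/00_training_and_validation"))
```

In the notebook, the first argument would be context['extraContext']['notebook_path'], and the result would be passed to mlflow.set_experiment as before.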

The error that is thrown is:

An exception was thrown from a UDF: 'mlflow.exceptions.RestException: INVALID_PARAMETER_VALUE: No experiment ID was specified. An experiment ID must be specified in Databricks Jobs and when logging to the MLflow server from outside the Databricks workspace. If using the Python fluent API, you can set an active experiment under which to create runs by calling mlflow.set_experiment("/path/to/experiment/in/workspace") at the start of your program.', from , line 32.

The pipeline only runs the notebook from ADF; it does not have any other steps. The cluster we are using is a 7.3 ML cluster.

Could you please help us?

Thank you in advance!

