Rohi_Dev_1.0

Reputation: 386

Azure Databricks with GitHub

I am working with a Databricks notebook that I have synced with GitHub. Two of us are working on two different branches in the GitHub repo. When we ran an Azure Data Factory activity on that notebook, it ran the latest version of the notebook.

So what's the purpose of having GitHub as version control, if we can't control which notebook version runs when it is executed from outside?

What if many developers commit their changes, but at the end of the day we need the master branch changes, which are the most stable, to be the ones executed?

Upvotes: 4

Views: 1159

Answers (2)

ferdyh

Reputation: 1445

We're actually not using the built-in Git sync on Databricks at all; instead we use the export_dir / import_dir functionality from databricks-cli. This way we have more control over what gets imported, and when. It also lets a single commit span multiple notebooks (since one feature usually touches more than one notebook).
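As a rough sketch of that workflow, the snippet below builds the `databricks workspace export_dir` and `databricks workspace import_dir` commands from the legacy databricks-cli. The workspace path `/Shared/etl`, the local folder `./notebooks`, and the helper names are assumptions made up for illustration, and the CLI is assumed to be installed and already configured with a token.

```python
import subprocess
from typing import List


def export_dir_cmd(workspace_path: str, local_path: str,
                   overwrite: bool = False) -> List[str]:
    """Build the command that pulls a workspace folder down to local disk."""
    cmd = ["databricks", "workspace", "export_dir", workspace_path, local_path]
    if overwrite:
        cmd.append("--overwrite")  # replace local files that already exist
    return cmd


def import_dir_cmd(local_path: str, workspace_path: str,
                   overwrite: bool = False) -> List[str]:
    """Build the command that pushes a local folder up to the workspace."""
    cmd = ["databricks", "workspace", "import_dir", local_path, workspace_path]
    if overwrite:
        cmd.append("--overwrite")  # replace notebooks that already exist
    return cmd


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command by default; only execute it when dry_run is False."""
    if dry_run:
        print(" ".join(cmd))
        return
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths: export for committing, import for deploying.
    run(export_dir_cmd("/Shared/etl", "./notebooks", overwrite=True))
    run(import_dir_cmd("./notebooks", "/Shared/etl", overwrite=True))
```

The exported folder can then be committed to Git like any other directory, so one commit can cover several notebooks, and nothing reaches the workspace until someone runs the import.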

Hopefully this helps.

Upvotes: 0

Wouter Dunnes

Reputation: 665

A Databricks notebook does not reload itself from Git. You need to make a copy of the notebook in your personal folder, develop there, and commit to a Git feature branch. After the pull request is merged into the main branch, you need to (re)deploy your notebooks from Git.

The notebook that actually runs your code should not be edited directly; only the personal copy should be.
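One way to picture the (re)deploy step is as a short list of commands: check out the stable branch, then import its notebooks into the workspace folder that the Data Factory activity points at. This is only a sketch under assumptions: the repo location, the `notebooks` subfolder, the `/Shared/production` target, and the `deploy_plan` helper are all hypothetical, and the import step assumes the legacy databricks-cli `import_dir` command.

```python
from typing import List


def deploy_plan(branch: str, repo_dir: str, notebooks_subdir: str,
                target_workspace_path: str) -> List[List[str]]:
    """Return the commands that deploy one branch's notebooks, in order."""
    local_notebooks = f"{repo_dir}/{notebooks_subdir}"
    return [
        # Make sure the local clone knows about the latest remote commits.
        ["git", "-C", repo_dir, "fetch", "origin"],
        # Switch to the branch we want to deploy (e.g. master or main).
        ["git", "-C", repo_dir, "checkout", branch],
        # Fast-forward only, so a diverged local branch fails loudly.
        ["git", "-C", repo_dir, "pull", "--ff-only", "origin", branch],
        # Overwrite the folder that external jobs execute from.
        ["databricks", "workspace", "import_dir",
         local_notebooks, target_workspace_path, "--overwrite"],
    ]


if __name__ == "__main__":
    # Hypothetical values; print the plan rather than executing it.
    for step in deploy_plan("master", "/tmp/my-repo", "notebooks",
                            "/Shared/production"):
        print(" ".join(step))
```

With this split, feature-branch work stays in personal folders, and the shared folder only changes when the stable branch is deployed, so an external trigger always runs the merged version.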

Upvotes: 1
