Sai Pavan

Reputation: 11

Issue with Databricks Workspace Conversion: Python Files Automatically Converted to Notebooks upon Merge

I have a notebook that utilizes a Python file to import some dictionaries. Both the notebook and the .py file reside in the repository within the development workspace. However, after merging these files into the workspace, the .py file is automatically converted into a Databricks notebook instead of remaining as a Python file. The .py file contains only dictionaries.

To work around this issue, I manually created a global_dicts.py file in the workspace to ensure my script runs properly. For instance, I created a test.py file in the repository and merged it into the master branch. We use Azure DevOps as our CI/CD pipeline, and after a successful build, the files are merged into the workspace. However, the .py files deployed into the workspace are being changed into Databricks notebooks.

test.py in the repo and in the workspace:

from global_dicts import _transactionTypes
from global_dicts import _methods
from global_dicts import _statuses
from global_dicts import _resources
from global_dicts import _endDeviceType
from global_dicts import _readingQuality
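For context, global_dicts.py itself is plain Python with no notebook header, just module-level dictionaries. A minimal sketch (the dictionary contents below are made-up placeholders; only the names come from my imports):

# global_dicts.py - plain Python module containing only dictionaries (placeholder values)
_transactionTypes = {1: "purchase", 2: "refund"}
_methods = {"GET": "read", "POST": "create"}
_statuses = {0: "inactive", 1: "active"}
_resources = {"meter": "/meters", "reading": "/readings"}
_endDeviceType = {"A": "single-phase", "B": "three-phase"}
_readingQuality = {"V": "valid", "E": "estimated"}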

I tried editing the file manually and it worked. I also found an article that explains how to fix this (link attached below); however, it didn't help me achieve what I was looking for.

https://docs.databricks.com/en/files/workspace.html#:~:text=To%20enable%20or%20re-enable%20support%20for%20non-notebook%20files,non-notebook%20files%20are%20already%20enabled%20for%20your%20workspace.

Upvotes: 0

Views: 676

Answers (1)

Alvin Zhao - MSFT

Reputation: 6147

I can reproduce the issue with the Databricks CLI as well.

databricks workspace import /Shared/TestFile-LocalZip.py --file TestFile-Local.zip --format SOURCE
databricks workspace import /Shared/TestFile-LocalPython.py --file TestFile-Local.py --language PYTHON
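Both CLI calls go through the Workspace Import REST API (/api/2.0/workspace/import), and format SOURCE plus language PYTHON tells the workspace to import the content as notebook source, which is why the .py shows up as a notebook. A rough Python sketch of the same call, assuming placeholder DATABRICKS_HOST and DATABRICKS_TOKEN environment variables for a PAT-authenticated workspace:

import base64
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://adb-xxxx.azuredatabricks.net (placeholder)
token = os.environ["DATABRICKS_TOKEN"]  # personal access token (placeholder)

with open("TestFile-Local.py", "rb") as f:
    content = base64.b64encode(f.read()).decode()

# SOURCE + PYTHON imports the item as a notebook; the AUTO format (where supported)
# instead lets the workspace decide between a file and a notebook from the
# extension and header content.
resp = requests.post(
    f"{host}/api/2.0/workspace/import",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "path": "/Shared/TestFile-LocalPython.py",
        "format": "SOURCE",
        "language": "PYTHON",
        "content": content,
        "overwrite": True,
    },
)
resp.raise_for_status()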

According to the documentation, it is more likely a limitation of Azure Databricks than an issue caused by Azure DevOps.

Since it says "There is limited support for workspace file operations from serverless compute", I figured out a workaround with the help of my personal compute cluster, where I could run a curl command to copy the file from an Azure Storage Account into the Azure Databricks workspace as a file, not a notebook. Here are my steps.

Upload my .py file into an Azure Storage Account blob container and generate a Blob SAS URL;

Connect to my personal compute cluster -> in the web terminal, run the curl command;

BlobSASURL="https://xxxxxxxx.blob.core.windows.net/testcontainer/TestFile-Storage.py?xxxxxxxxx"

curl "$BlobSASURL" -o /Workspace/Shared/TestFile-Storage.py

Upvotes: 0
