Reputation: 139
I am trying to copy a file to Azure Databricks DBFS through an Azure DevOps pipeline. The following is a snippet from the yml file I am using:
stages:
- stage: MYBuild
  displayName: "My Build"
  jobs:
  - job: BuildwhlAndRunPytest
    pool:
      vmImage: 'ubuntu-16.04'
    steps:
    - task: UsePythonVersion@0
      displayName: 'Use Python 3.7'
      inputs:
        versionSpec: '3.7'
        addToPath: true
        architecture: 'x64'
    - script: |
        pip install pytest requests setuptools wheel pytest-cov
        pip install -U databricks-connect==7.3.*
      displayName: 'Load Python Dependencies'
    - checkout: self
      persistCredentials: true
      clean: true
    - script: |
        echo "y
        $(databricks-host)
        $(databricks-token)
        $(databricks-cluster)
        $(databricks-org-id)
        8787" | databricks-connect configure
        databricks-connect test
      env:
        databricks-token: $(databricks-token)
      displayName: 'Configure DBConnect'
    - script: |
        databricks fs cp test-proj/pyspark-lib/configs/config.ini dbfs:/configs/test-proj/config.ini
I get the following error at the stage where I am invoking the databricks fs cp command:
/home/vsts/work/_temp/2278f7d5-1d96-4c4e-a501-77c07419773b.sh: line 7: databricks: command not found
However, when I run databricks-connect test, it executes successfully. Kindly help if I am missing some steps somewhere.
Upvotes: 3
Views: 3043
Reputation: 87069
The databricks command comes from the databricks-cli package, not from databricks-connect, so you need to change your pip install command.
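For instance, if you need both Databricks Connect and the CLI in the same job, the dependency step could install both packages side by side (a minimal sketch based on the question's own step; the pinned databricks-connect version is carried over from there):

- script: |
    pip install pytest requests setuptools wheel pytest-cov
    # databricks-connect provides the databricks-connect command;
    # the databricks command comes from the separate databricks-cli package
    pip install -U databricks-connect==7.3.* databricks-cli
  displayName: 'Load Python Dependencies'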
Also, for the databricks command you can just set the environment variables DATABRICKS_HOST and DATABRICKS_TOKEN, and it will work, like this:
- script: |
    pip install pytest requests setuptools wheel
    pip install -U databricks-cli
  displayName: 'Load Python Dependencies'
- script: |
    databricks fs cp ... dbfs:/...
  env:
    DATABRICKS_HOST: $(DATABRICKS_HOST)
    DATABRICKS_TOKEN: $(DATABRICKS_TOKEN)
  displayName: 'Copy artifacts'
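To see that the CLI picks these variables up, you can reproduce the same behaviour locally (a minimal sketch; the workspace URL, token, and file paths below are placeholders, not values from the post):

# Placeholders only - substitute your own workspace URL and personal access token
export DATABRICKS_HOST='https://adb-1234567890123456.7.azuredatabricks.net'
export DATABRICKS_TOKEN='dapi...'

# databricks-cli reads these environment variables directly, so no
# `databricks configure` step is needed before running commands
databricks fs ls dbfs:/
databricks fs cp configs/config.ini dbfs:/configs/test-proj/config.ini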
P.S. Here is an example of how to do CI/CD on Databricks with notebooks. You could also be interested in the cicd-templates project.
Upvotes: 3