Reputation: 1214
I want to try out cluster-scoped init scripts on an Azure Databricks cluster. I'm struggling to see which commands are available.
Basically, I've got a file on DBFS that I want to copy to a local directory /tmp/config when the cluster spins up.
So I created a very simple bash script:
#!/bin/bash
mkdir -p /tmp/config
databricks fs cp dbfs:/path/to/myFile.conf /tmp/config
Spinning up the cluster fails with "Cluster terminated. Reason: Init Script Failure". Looking at the log on DBFS, I see the error
bash: line 1: databricks: command not found
OK, so the databricks command is not available on the cluster. That's the command I use in my local shell to copy files to and from DBFS.
What other commands are available to copy a file from DBFS? And more generally: which commands are actually available?
Upvotes: 2
Views: 3901
Reputation: 516
DBFS is mounted on the cluster nodes at /dbfs, so you can copy the file directly in your shell script:
e.g.
cp /dbfs/your-folder/your-file.txt ./your-file.txt
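Applied to the question's scenario, the whole init script could look like this (a minimal sketch; dbfs:/path/to/myFile.conf is the placeholder path from the question):
#!/bin/bash
# DBFS is FUSE-mounted at /dbfs on the cluster nodes,
# so a plain cp works without the Databricks CLI
mkdir -p /tmp/config
cp /dbfs/path/to/myFile.conf /tmp/config/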
If you run dir on /dbfs, it returns all the folders/data you have in DBFS.
You can also first test it in a notebook via
%sh
cd /dbfs
dir
Upvotes: 2
Reputation: 12768
By default, the Databricks CLI is not installed on a Databricks cluster. That's the reason you see the error message bash: line 1: databricks: command not found.
To copy the file, you can use dbutils commands as shown below.
# Create the target directory
dbutils.fs.mkdirs("/tmp/config")
# Move the file into it
dbutils.fs.mv("/configuration/proxy.conf", "/tmp/config")
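Note that paths passed to dbutils.fs without a scheme refer to DBFS, so the snippet above creates dbfs:/tmp/config rather than the driver-local /tmp/config from the question. If the driver's local disk is the actual target, the file:/ scheme can be used; a sketch reusing the question's placeholder path:
# Ensure the driver-local directory exists, then copy from DBFS
dbutils.fs.mkdirs("file:/tmp/config")
dbutils.fs.cp("dbfs:/path/to/myFile.conf", "file:/tmp/config/myFile.conf")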
Reference: Databricks Utilities
Hope this helps.
Upvotes: 1