Reputation: 1869
I have a Dockerfile in which I want to configure the Databricks CLI. I am able to install the CLI in the Docker image, and I have a working Python file that submits the job to the Databricks API, but I'm unsure how to configure the CLI within Docker.
Here is what I have:
FROM python
MAINTAINER nope
# Creating Application Source Code Directory
RUN mkdir -p /src
# Setting Home Directory for containers
WORKDIR /src
# Installing python dependencies
RUN pip install databricks_cli
# Not sure how to do this part???
# databricks configure --token kicks off the config via an interactive CLI prompt
RUN databricks configure --token
# Copying src code to Container
COPY . /src
# Start Container
# Print CLI version during build (only the last CMD in a Dockerfile takes effect)
RUN databricks --version
# Kicks off Python Job
CMD ["python", "get_run.py"]
If I were to run databricks configure --token in the CLI, it would prompt for the configs like this:
databricks configure --token
Databricks Host (should begin with https://):
Upvotes: 4
Views: 2245
Reputation: 1
If you want to access Databricks models/download_artifacts using a hostname and access token, like you do with the Databricks CLI:
databricks configure --token --profile profile_name
Databricks Host (should begin with https://): your_hostname
Token : token
If you have created a profile name and pushed models, and you just want to access the models/artifacts in Docker using this profile, add the code below to the Dockerfile.
RUN pip install databricks_cli
ARG HOST_URL
ARG TOKEN
# printf writes the newlines reliably regardless of which shell echo this runs under
RUN printf "[<profile name>]\nhost = %s\ntoken = %s\n" "${HOST_URL}" "${TOKEN}" >> ~/.databrickscfg
# This creates your .databrickscfg file with the host and token at build time, the same way the databricks configure command does
Add the HOST_URL and TOKEN args in the docker build, e.g.:
your host name = https://adb-5443106279769864.19.azuredatabricks.net/
your access token = dapi********************53b1-2
sudo docker build -t test_tag --build-arg HOST_URL=<your host name> --build-arg TOKEN=<your access token> .
And now you can access your experiments in code using this profile name, e.g. databricks://profile_name.
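As a quick sanity check (a sketch: test_tag and profile_name are the placeholder names from above, and the workspace must be reachable), you can confirm the profile works inside the image with the CLI:
docker run --rm test_tag databricks fs ls dbfs:/ --profile profile_name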
Upvotes: 0
Reputation: 56
The number of personal access tokens per user is limited to 600. But via bash it is easy to pipe the answers into the interactive prompts (the $(…) tokens here are CI pipeline variables, substituted before the shell runs):
echo "y
$(WORKSPACE-REGION-URL)
$(CSE-DEVELOP-PAT)
$(EXISTING-CLUSTER-ID)
$(WORKSPACE-ORG-ID)
15001" | databricks-connect configure
Upvotes: 0
Reputation: 166
It is not very secure to put your token in the Dockerfile. However, if you want to pursue this approach, you can use the code below.
RUN export DATABRICKS_HOST=XXXXX && \
export DATABRICKS_API_TOKEN=XXXXX && \
export DATABRICKS_ORG_ID=XXXXX && \
export DATABRICKS_PORT=XXXXX && \
export DATABRICKS_CLUSTER_ID=XXXXX && \
echo "{\"host\": \"${DATABRICKS_HOST}\",\"token\": \"${DATABRICKS_API_TOKEN}\",\"cluster_id\":\"${DATABRICKS_CLUSTER_ID}\",\"org_id\": \"${DATABRICKS_ORG_ID}\", \"port\": \"${DATABRICKS_PORT}\" }" >> /root/.databricks-connect
Make sure to run all the commands in one line using a single RUN command. Otherwise, variables such as DATABRICKS_HOST or DATABRICKS_API_TOKEN may not propagate properly.
If you want to connect to a Databricks Cluster within a docker container you need more configuration. You can find the required details in this article: How to Connect a Local or Remote Machine to a Databricks Cluster
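To verify the resulting configuration inside the container, databricks-connect ships a built-in check (a sketch: <your-image> is a placeholder, and the command needs network access to the workspace and the configured cluster):
docker run --rm <your-image> databricks-connect test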
Upvotes: 0
Reputation: 87279
It's better not to do it this way, for multiple reasons: most obviously, the token gets baked into the image, where anyone with access to the image can read it.
Instead, it's just better to pass two environment variables to the container, and they will be picked up by the databricks command. These are DATABRICKS_HOST and DATABRICKS_TOKEN, as described in the documentation.
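For example (a sketch reusing the test_tag image from earlier; the host and token values are placeholders):
docker run --rm \
  -e DATABRICKS_HOST="https://<your-workspace-url>" \
  -e DATABRICKS_TOKEN="<your-access-token>" \
  test_tag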
Upvotes: 4
Reputation: 6482
When databricks configure is run successfully, it writes the information to the file ~/.databrickscfg:
[DEFAULT]
host = https://your-databricks-host-url
token = your-api-token
One way you could set this in the container is by using a startup command (the syntax here is for docker-compose.yml):
/bin/bash -ic "echo -e '[DEFAULT]\nhost = ${HOST_URL}\ntoken = ${TOKEN}' > ~/.databrickscfg"
(Note the -e flag: without it, bash's echo writes the \n sequences literally instead of as newlines.)
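Put together, a minimal docker-compose.yml sketch (the service name and image tag are placeholders; ${HOST_URL} and ${TOKEN} are substituted by compose from the environment it runs in, and the job script from the question is chained on afterwards):
services:
  job:
    image: test_tag
    command: /bin/bash -ic "echo -e '[DEFAULT]\nhost = ${HOST_URL}\ntoken = ${TOKEN}' > ~/.databrickscfg && python get_run.py"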
Upvotes: 0